19 Machine Learning
19.1 Getting Started
19.1.1 Load Packages
19.1.2 Load Data
Code
# Downloaded Data - Processed
load(file = "./data/nfl_players.RData")
load(file = "./data/nfl_teams.RData")
load(file = "./data/nfl_rosters.RData")
load(file = "./data/nfl_rosters_weekly.RData")
load(file = "./data/nfl_schedules.RData")
load(file = "./data/nfl_combine.RData")
load(file = "./data/nfl_draftPicks.RData")
load(file = "./data/nfl_depthCharts.RData")
load(file = "./data/nfl_pbp.RData")
load(file = "./data/nfl_4thdown.RData")
load(file = "./data/nfl_participation.RData")
#load(file = "./data/nfl_actualFantasyPoints_weekly.RData")
load(file = "./data/nfl_injuries.RData")
load(file = "./data/nfl_snapCounts.RData")
load(file = "./data/nfl_espnQBR_seasonal.RData")
load(file = "./data/nfl_espnQBR_weekly.RData")
load(file = "./data/nfl_nextGenStats_weekly.RData")
load(file = "./data/nfl_advancedStatsPFR_seasonal.RData")
load(file = "./data/nfl_advancedStatsPFR_weekly.RData")
load(file = "./data/nfl_playerContracts.RData")
load(file = "./data/nfl_ftnCharting.RData")
load(file = "./data/nfl_playerIDs.RData")
load(file = "./data/nfl_rankings_draft.RData")
load(file = "./data/nfl_rankings_weekly.RData")
load(file = "./data/nfl_expectedFantasyPoints_weekly.RData")
load(file = "./data/nfl_expectedFantasyPoints_pbp.RData")
# Calculated Data - Processed
load(file = "./data/nfl_actualStats_career.RData")
load(file = "./data/nfl_actualStats_seasonal.RData")
load(file = "./data/player_stats_weekly.RData")
load(file = "./data/player_stats_seasonal.RData")
19.1.3 Specify Options
19.2 Overview of Machine Learning
Machine learning takes us away from focusing on causal inference. Machine learning does not care about which processes are causal—i.e., which processes influence the outcome. Instead, machine learning cares about prediction—it cares about a predictor variable to the extent that it increases predictive accuracy regardless of whether it is causally related to the outcome.
Machine learning can be useful for leveraging big data and many predictor variables to develop predictive models with greater accuracy. However, many machine learning techniques are black boxes: it is often unclear how or why certain predictions are made, which can make it difficult to interpret the model’s decisions and understand the underlying relationships between variables. Machine learning tends to be a data-driven, atheoretical technique, which can result in overfitting.
Thus, when estimating machine learning models, it is common to keep a hold-out sample for use in cross-validation to evaluate the extent of shrinkage of model coefficients. The data that the model is trained on is known as the “training data”. The data that the model was not trained on but is then independently tested on (i.e., the hold-out sample) is the “test data”. Shrinkage occurs when predictor variables explain some random error variance in the original model. When the model is applied to an independent sample (i.e., the test data), the predictive model will likely not perform quite as well, and the regression coefficients will tend to get smaller (i.e., shrink).
If the test data were collected as part of the same processes as the original data and were merely held out for purposes of analysis, this is called internal cross-validation. If the test data were collected separately from the original data used to train the model, this is called external cross-validation.
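The drop in accuracy from training data to test data can be illustrated with a small simulation. The following is a minimal sketch using simulated data and base R only; the object names (`simData`, `rsq`, etc.) are hypothetical and are not used elsewhere in this chapter. The model includes one predictor that is truly related to the outcome and 30 pure-noise predictors, so it has ample opportunity to fit random error in the training data.

```r
# Toy illustration of overfitting (simulated data; all names are hypothetical)
set.seed(52242)

n <- 200
numNoisePredictors <- 30

# One predictor truly related to the outcome, plus pure-noise predictors
simData <- data.frame(x_true = rnorm(n))
simData$y <- 0.5 * simData$x_true + rnorm(n)
noise <- as.data.frame(matrix(rnorm(n * numNoisePredictors), nrow = n))
names(noise) <- paste0("noise", seq_len(numNoisePredictors))
simData <- cbind(simData, noise)

# Hold out 20% of rows as test data
testRows <- sample(seq_len(n), size = ceiling(.2 * n))
trainData <- simData[-testRows, ]
testData <- simData[testRows, ]

# Fit the model on the training data only
fit <- lm(y ~ ., data = trainData)

# Squared correlation between observed and predicted values
rsq <- function(observed, predicted) cor(observed, predicted)^2

rsq_train <- rsq(trainData$y, predict(fit, newdata = trainData))
rsq_test <- rsq(testData$y, predict(fit, newdata = testData))

c(train = rsq_train, test = rsq_test)
```

Because the noise predictors explain only random error in the training data, the test value would typically be lower than the training value, although the size of the gap depends on the particular sample.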
Most machine learning methods were developed with cross-sectional data in mind. That is, they assume that each person has only one observation on the outcome variable. However, with longitudinal data, each person has multiple observations on the outcome variable.
When performing machine learning, various approaches may help address this:
- transform data from long to wide form, so that each person has only one row
- when designing the training and test sets, keep all measurements from the same person in the same data object (either the training or test set); do not have some measurements from a given person in the training set and other measurements from the same person in the test set
- use a machine learning approach that accounts for the clustered/nested nature of the data
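The first two approaches can be sketched in base R. This is a toy example with made-up data; the names `toyData`, `player_id`, and `points` are hypothetical. The key idea is to sample people (here, players) rather than rows when forming the hold-out set.

```r
# Toy longitudinal data: 10 players, 3 seasons each (hypothetical values)
set.seed(52242)

toyData <- data.frame(
  player_id = rep(paste0("P", 1:10), each = 3),
  season = rep(2021:2023, times = 10),
  points = round(rnorm(30, mean = 150, sd = 40))
)

# Sample players (not rows) for the hold-out set
playerIDs <- unique(toyData$player_id)
holdoutIDs <- sample(playerIDs, size = ceiling(.2 * length(playerIDs)))

trainData <- toyData[!(toyData$player_id %in% holdoutIDs), ]
testData <- toyData[toyData$player_id %in% holdoutIDs, ]

# No player appears in both sets
length(intersect(trainData$player_id, testData$player_id))

# Alternative: reshape from long to wide form so each player has one row
toyWide <- reshape(
  toyData,
  idvar = "player_id",
  timevar = "season",
  direction = "wide")

dim(toyWide)
```

Splitting by player means that the test data contain only players the model has never seen, which is a stricter check than holding out individual seasons at random.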
19.3 Types of Machine Learning
There are many approaches to machine learning. This chapter discusses several key ones:
- supervised learning
  - continuous outcome (i.e., regression)
    - linear regression
    - lasso regression
    - ridge regression
    - elastic net regression
  - categorical outcome (i.e., classification)
    - logistic regression
    - support vector machine
    - random forest
    - extreme gradient boosting
- unsupervised learning
  - clustering
  - principal component analysis
- semi-supervised learning
- reinforcement learning
- deep learning
- ensemble
Ensemble machine learning methods combine multiple machine learning approaches with the goal that the combination might yield more accurate predictions than any one method could achieve on its own.
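One of the simplest ways to form an ensemble is to average the predictions of several models. The following is a minimal sketch using simulated data and two base R models (a linear regression and a local regression via `loess()`); the data and object names are hypothetical, and simple averaging is only one of many ways to combine models.

```r
# Toy ensemble: average the predictions of two different models
set.seed(52242)

n <- 300
dat <- data.frame(x = runif(n, min = -2, max = 2))
dat$y <- sin(dat$x) + 0.5 * dat$x + rnorm(n, sd = 0.5)

testRows <- sample(seq_len(n), size = 60)
trainData <- dat[-testRows, ]
testData <- dat[testRows, ]

# Model 1: linear regression
fitLinear <- lm(y ~ x, data = trainData)

# Model 2: local regression; surface = "direct" allows predictions
# slightly outside the range of the training data
fitLoess <- loess(
  y ~ x,
  data = trainData,
  control = loess.control(surface = "direct"))

predLinear <- predict(fitLinear, newdata = testData)
predLoess <- predict(fitLoess, newdata = testData)

# Ensemble prediction: simple average of the two models' predictions
predEnsemble <- (predLinear + predLoess) / 2

rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

c(
  linear = rmse(testData$y, predLinear),
  loess = rmse(testData$y, predLoess),
  ensemble = rmse(testData$y, predEnsemble))
```

An ensemble is not guaranteed to outperform its best member; whether it does should be checked on the test data, as in the comparison above.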
19.3.1 Supervised Learning
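In supervised learning, the model is trained on data in which the outcome variable is observed (i.e., labeled), and it learns to predict the outcome from the predictor variables.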
Unlike linear and logistic regression, various machine learning techniques can handle multicollinearity, including LASSO regression, ridge regression, and elastic net regression. Least absolute shrinkage and selection operator (LASSO) regression selects which predictor variables to keep in the model by shrinking some coefficients to zero. Ridge regression shrinks the coefficients of predictor variables toward zero, but not to zero, so it does not select which predictor variables to retain; this allows it to retain nonzero coefficients for multiple correlated predictor variables in the context of multicollinearity. Elastic net combines LASSO and ridge regression: it selects predictor variables by shrinking the coefficients of some predictor variables to zero, and it shrinks the coefficients of other predictor variables toward zero to address multicollinearity.
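To make the contrast between ridge and LASSO concrete, here is a minimal base R sketch on simulated data with two highly correlated predictors (`x1`, `x2`) and one pure-noise predictor (`x3`). It is an illustration only: ridge is computed from its closed-form solution, LASSO from a hand-written coordinate descent loop (the helper names `softThreshold` and `lassoCD` are hypothetical), and the penalty values are arbitrary. In practice, a dedicated package with cross-validated penalties would normally be used.

```r
# Toy comparison of OLS, ridge, and LASSO coefficients (simulated data)
set.seed(52242)

n <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3) # highly correlated with x1
x3 <- rnorm(n)                # unrelated to the outcome
y <- x1 + x2 + rnorm(n)

X <- scale(cbind(x1, x2, x3)) # standardize the predictors
yc <- y - mean(y)             # center the outcome

# Ordinary least squares
betaOLS <- as.numeric(solve(crossprod(X), crossprod(X, yc)))

# Ridge: minimizes RSS + lambda * sum(beta^2); closed-form solution
lambdaRidge <- 50
betaRidge <- as.numeric(
  solve(crossprod(X) + lambdaRidge * diag(ncol(X)), crossprod(X, yc)))

# LASSO: minimizes RSS / (2n) + lambda * sum(abs(beta)); coordinate descent
softThreshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

lassoCD <- function(X, y, lambda, numIter = 200) {
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  for (iter in seq_len(numIter)) {
    for (j in seq_len(ncol(X))) {
      # residual after removing the contribution of all other predictors
      partialResid <- y - X[, -j, drop = FALSE] %*% beta[-j]
      rho <- sum(X[, j] * partialResid) / n
      beta[j] <- softThreshold(rho, lambda) / (sum(X[, j]^2) / n)
    }
  }
  beta
}

betaLasso <- lassoCD(X, yc, lambda = 0.3)

round(data.frame(
  ols = betaOLS,
  ridge = betaRidge,
  lasso = betaLasso,
  row.names = colnames(X)), 3)
```

The ridge coefficients are pulled toward zero but remain nonzero, whereas the soft-thresholding step in LASSO can set a coefficient exactly to zero, which is what would be expected here for the noise predictor.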
Unless interactions or nonlinear terms are specified, linear, logistic, LASSO, ridge, and elastic net regression do not account for interactions among the predictor variables or for nonlinear associations between the predictor variables and the outcome variable. By contrast, random forests and extreme gradient boosting do account for interactions among the predictor variables and for nonlinear associations between the predictor variables and the outcome variable.
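The point about regression models can be seen in a small simulation in which the outcome depends only on the product of two predictors. This is a minimal base R sketch with simulated data and hypothetical object names; it illustrates the regression side of the comparison only.

```r
# Toy data in which the outcome depends on an interaction (simulated)
set.seed(52242)

n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- dat$x1 * dat$x2 + rnorm(n, sd = 0.5)

# Main effects only: the interaction is not modeled
fitMain <- lm(y ~ x1 + x2, data = dat)

# Interaction explicitly specified (x1 * x2 expands to x1 + x2 + x1:x2)
fitInteraction <- lm(y ~ x1 * x2, data = dat)

c(
  mainEffectsOnly = summary(fitMain)$r.squared,
  withInteraction = summary(fitInteraction)$r.squared)
```

The main-effects model explains little of the outcome because the association exists only in the interaction term, which the model captures only when the analyst specifies it.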
19.3.2 Unsupervised Learning
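In unsupervised learning, there is no outcome variable; the goal is to identify structure in the data, such as groups of similar observations (clustering) or a smaller set of dimensions that summarize the variables (principal component analysis).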
We describe cluster analysis in Chapter 21. We describe principal component analysis in Chapter 23.
19.3.3 Semi-supervised Learning
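Semi-supervised learning trains a model using a combination of labeled data (in which the outcome is observed) and unlabeled data (in which it is not), typically a small amount of the former and a larger amount of the latter.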
19.3.4 Reinforcement Learning
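In reinforcement learning, an agent learns which actions to take by interacting with an environment and receiving rewards or penalties for its actions.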
19.4 Data Processing
19.4.1 Prepare Data for Merging
Code
# Prepare data for merging
#-todo: calculate years_of_experience
## Use common name for the same (gsis_id) ID variable
#nfl_actualFantasyPoints_player_weekly <- nfl_actualFantasyPoints_player_weekly %>%
# rename(gsis_id = player_id)
#
#nfl_actualFantasyPoints_player_seasonal <- nfl_actualFantasyPoints_player_seasonal %>%
# rename(gsis_id = player_id)
player_stats_seasonal_offense <- player_stats_seasonal %>%
filter(position_group %in% c("QB","RB","WR","TE")) %>%
rename(gsis_id = player_id)
player_stats_weekly_offense <- player_stats_weekly %>%
filter(position_group %in% c("QB","RB","WR","TE")) %>%
rename(gsis_id = player_id)
nfl_expectedFantasyPoints_weekly <- nfl_expectedFantasyPoints_weekly %>%
rename(gsis_id = player_id)
## Rename other variables to ensure common names
## Ensure variables with the same name have the same type
nfl_players <- nfl_players %>%
mutate(
birth_date = as.Date(birth_date),
jersey_number = as.character(jersey_number),
gsis_it_id = as.character(gsis_it_id),
years_of_experience = as.integer(years_of_experience))
player_stats_seasonal_offense <- player_stats_seasonal_offense %>%
mutate(
birth_date = as.Date(birth_date),
jersey_number = as.character(jersey_number),
gsis_it_id = as.character(gsis_it_id))
nfl_rosters <- nfl_rosters %>%
mutate(
draft_number = as.integer(draft_number))
nfl_rosters_weekly <- nfl_rosters_weekly %>%
mutate(
draft_number = as.integer(draft_number))
nfl_depthCharts <- nfl_depthCharts %>%
mutate(
season = as.integer(season))
nfl_expectedFantasyPoints_weekly <- nfl_expectedFantasyPoints_weekly %>%
mutate(
season = as.integer(season),
receptions = as.integer(receptions)) %>%
distinct(gsis_id, season, week, .keep_all = TRUE) # drop duplicated rows
## Rename variables
nfl_draftPicks <- nfl_draftPicks %>%
rename(
games_career = games,
pass_completions_career = pass_completions,
pass_attempts_career = pass_attempts,
pass_yards_career = pass_yards,
pass_tds_career = pass_tds,
pass_ints_career = pass_ints,
rush_atts_career = rush_atts,
rush_yards_career = rush_yards,
rush_tds_career = rush_tds,
receptions_career = receptions,
rec_yards_career = rec_yards,
rec_tds_career = rec_tds,
def_solo_tackles_career = def_solo_tackles,
def_ints_career = def_ints,
def_sacks_career = def_sacks
)
## Subset variables
nfl_expectedFantasyPoints_weekly <- nfl_expectedFantasyPoints_weekly %>%
select(gsis_id:position, contains("_exp"), contains("_diff"), contains("_team")) #drop "raw stats" variables (e.g., rec_yards_gained) so they don't get coalesced with actual stats
# Check duplicate ids
player_stats_seasonal_offense %>%
group_by(gsis_id, season) %>%
filter(n() > 1) %>%
head()
Code
Identify objects with shared variable names:
[1] "gsis_id" "position"
[1] 21360
[1] 2855
[1] "gsis_id" "season" "team" "age"
[1] 14859
[1] 10395
[1] 14858
[1] 10395
[1] 14859
[1] 10325
[1] "gsis_id" "season" "week" "position" "full_name"
[1] 845134
[1] 100272
[1] 841942
[1] 100272
[1] 845101
[1] 97815
[1] 845118
[1] 97815
19.4.2 Merge Data
To merge data, we use the powerjoin package (Fabri, 2022):
Code
# Create lists of objects to merge, depending on data structure: id; or id-season; or id-season-week
#-todo: remove redundant variables
playerListToMerge <- list(
nfl_players %>% filter(!is.na(gsis_id)),
nfl_draftPicks %>% filter(!is.na(gsis_id)) %>% select(-season)
)
playerSeasonListToMerge <- list(
player_stats_seasonal_offense %>% filter(!is.na(gsis_id), !is.na(season)),
nfl_advancedStatsPFR_seasonal %>% filter(!is.na(gsis_id), !is.na(season))
)
playerSeasonWeekListToMerge <- list(
nfl_rosters_weekly %>% filter(!is.na(gsis_id), !is.na(season), !is.na(week)),
#nfl_actualStats_offense_weekly,
nfl_expectedFantasyPoints_weekly %>% filter(!is.na(gsis_id), !is.na(season), !is.na(week))
#nfl_advancedStatsPFR_weekly,
)
playerSeasonWeekPositionListToMerge <- list(
nfl_depthCharts %>% filter(!is.na(gsis_id), !is.na(season), !is.na(week))
)
# Merge data
playerMerged <- playerListToMerge %>%
reduce(
powerjoin::power_full_join,
by = c("gsis_id"),
conflict = coalesce_xy) # where the objects have the same variable name (e.g., position), keep the values from object 1, unless it's NA, in which case use the relevant value from object 2
playerSeasonMerged <- playerSeasonListToMerge %>%
reduce(
powerjoin::power_full_join,
by = c("gsis_id","season"),
conflict = coalesce_xy) # where the objects have the same variable name (e.g., team), keep the values from object 1, unless it's NA, in which case use the relevant value from object 2
playerSeasonWeekMerged <- playerSeasonWeekListToMerge %>%
reduce(
powerjoin::power_full_join,
by = c("gsis_id","season","week"),
conflict = coalesce_xy) # where the objects have the same variable name (e.g., position), keep the values from object 1, unless it's NA, in which case use the relevant value from object 2
Identify objects with shared variable names:
[1] "gsis_id" "position"
[3] "position_group" "first_name"
[5] "last_name" "esb_id"
[7] "display_name" "rookie_year"
[9] "college_conference" "current_team_id"
[11] "draft_club" "draft_number"
[13] "draftround" "entry_year"
[15] "football_name" "gsis_it_id"
[17] "headshot" "jersey_number"
[19] "short_name" "smart_id"
[21] "status" "status_description_abbr"
[23] "status_short_description" "uniform_number"
[25] "height" "weight"
[27] "college_name" "birth_date"
[29] "suffix" "years_of_experience"
[31] "pfr_player_name" "team"
[33] "age"
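The conflict rule described in the code comments (keep the value from object 1 unless it is `NA`, in which case use the value from object 2) can be illustrated on a toy example. The sketch below mimics that rule with base R's `merge()` rather than powerjoin itself; the two small data frames and their ID values are hypothetical.

```r
# Two toy objects that share an ID variable and a "position" variable
obj1 <- data.frame(
  gsis_id = c("A", "B", "C"),
  position = c("QB", NA, "WR"),
  stringsAsFactors = FALSE)

obj2 <- data.frame(
  gsis_id = c("B", "C", "D"),
  position = c("RB", "TE", "TE"),
  stringsAsFactors = FALSE)

# Full join: keep all rows from both objects; the shared variable
# is kept twice, as position.x (object 1) and position.y (object 2)
merged <- merge(
  obj1, obj2,
  by = "gsis_id",
  all = TRUE,
  suffixes = c(".x", ".y"))

# Coalesce: use the value from object 1 unless it is NA
merged$position <- ifelse(
  is.na(merged$position.x),
  merged$position.y,
  merged$position.x)

merged <- merged[, c("gsis_id", "position")]
merged
# gsis_id A: "QB" (only in object 1)
# gsis_id B: "RB" (object 1 is NA, so object 2 is used)
# gsis_id C: "WR" (object 1 takes precedence over "TE")
# gsis_id D: "TE" (only in object 2)
```

Because object 1 takes precedence, the order of the objects in each merge list determines which source is trusted when the two disagree.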
Code
seasonalData <- powerjoin::power_full_join(
playerSeasonMerged,
playerMerged %>% select(-age, -years_of_experience, -team, -team_abbr, -team_seq, -current_team_id), # drop variables from id objects that change from year to year (and thus are not necessarily accurate for a given season)
by = "gsis_id",
conflict = coalesce_xy # where the objects have the same variable name (e.g., position), keep the values from object 1, unless it's NA, in which case use the relevant value from object 2
) %>%
filter(!is.na(season)) %>%
select(gsis_id, season, player_display_name, position, team, games, everything())
[1] "gsis_id" "season"
[3] "week" "team"
[5] "jersey_number" "status"
[7] "first_name" "last_name"
[9] "birth_date" "height"
[11] "weight" "college"
[13] "pfr_id" "headshot_url"
[15] "status_description_abbr" "football_name"
[17] "esb_id" "gsis_it_id"
[19] "smart_id" "entry_year"
[21] "rookie_year" "draft_club"
[23] "draft_number" "position"
Code
seasonalAndWeeklyData <- powerjoin::power_full_join(
playerSeasonWeekMerged,
seasonalData,
by = c("gsis_id","season"),
conflict = coalesce_xy # where the objects have the same variable name (e.g., position), keep the values from object 1, unless it's NA, in which case use the relevant value from object 2
) %>%
filter(!is.na(week)) %>%
select(gsis_id, season, week, full_name, position, team, everything())
19.4.3 Additional Processing
19.4.4 Fill in Missing Data for Static Variables
19.4.5 Lag Fantasy Points
19.4.6 Subset to Predictor Variables and Outcome Variable
Code
dropVars <- c(
"birth_date", "loaded", "full_name", "player_name", "player_display_name", "display_name", "suffix", "headshot_url", "player", "pos",
"espn_id", "sportradar_id", "yahoo_id", "rotowire_id", "pff_id", "fantasy_data_id", "sleeper_id", "pfr_id",
"pfr_player_id", "cfb_player_id", "pfr_player_name", "esb_id", "gsis_it_id", "smart_id",
"college", "college_name", "team_abbr", "current_team_id", "college_conference", "draft_club", "status_description_abbr",
"status_short_description", "short_name", "headshot", "uniform_number", "jersey_number", "first_name", "last_name",
"football_name", "team")
seasonalData_lag_subset <- seasonalData_lag %>%
dplyr::select(-any_of(dropVars))
19.4.7 Separate by Position
Code
seasonalData_lag_subsetQB <- seasonalData_lag_subset %>%
filter(position == "QB") %>%
select(
gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic,
height, weight, rookie_year, draft_number,
fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag,
completions:rushing_2pt_conversions, special_teams_tds, contains(".pass"), contains(".rush"))
seasonalData_lag_subsetRB <- seasonalData_lag_subset %>%
filter(position == "RB") %>%
select(
gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic,
height, weight, rookie_year, draft_number,
fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag,
carries:special_teams_tds, contains(".rush"), contains(".rec"))
seasonalData_lag_subsetWR <- seasonalData_lag_subset %>%
filter(position == "WR") %>%
select(
gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic,
height, weight, rookie_year, draft_number,
fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag,
carries:special_teams_tds, contains(".rush"), contains(".rec"))
seasonalData_lag_subsetTE <- seasonalData_lag_subset %>%
filter(position == "TE") %>%
select(
gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic,
height, weight, rookie_year, draft_number,
fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag,
carries:special_teams_tds, contains(".rush"), contains(".rec"))
19.4.8 Split into Test and Training Data
Code
seasonalData_lag_qb_all <- seasonalData_lag_subsetQB
seasonalData_lag_rb_all <- seasonalData_lag_subsetRB
seasonalData_lag_wr_all <- seasonalData_lag_subsetWR
seasonalData_lag_te_all <- seasonalData_lag_subsetTE
set.seed(52242) # for reproducibility (to keep the same train/holdout players)
activeQBs <- unique(seasonalData_lag_qb_all$gsis_id[which(seasonalData_lag_qb_all$season == max(seasonalData_lag_qb_all$season, na.rm = TRUE))])
retiredQBs <- unique(seasonalData_lag_qb_all$gsis_id[which(seasonalData_lag_qb_all$gsis_id %ni% activeQBs)])
numQBs <- length(unique(seasonalData_lag_qb_all$gsis_id))
qbHoldoutIDs <- sample(retiredQBs, size = ceiling(.2 * numQBs)) # holdout 20% of players
activeRBs <- unique(seasonalData_lag_rb_all$gsis_id[which(seasonalData_lag_rb_all$season == max(seasonalData_lag_rb_all$season, na.rm = TRUE))])
retiredRBs <- unique(seasonalData_lag_rb_all$gsis_id[which(seasonalData_lag_rb_all$gsis_id %ni% activeRBs)])
numRBs <- length(unique(seasonalData_lag_rb_all$gsis_id))
rbHoldoutIDs <- sample(retiredRBs, size = ceiling(.2 * numRBs)) # holdout 20% of players
set.seed(52242) # for reproducibility (to keep the same train/holdout players); added here to prevent a downstream error with predict.missRanger() due to missingness; this suggests that an error can arise from including a player in the holdout sample who has missingness in particular variables; would be good to identify which player(s) in the holdout sample evoke that error to identify the kinds of missingness that yield the error
activeWRs <- unique(seasonalData_lag_wr_all$gsis_id[which(seasonalData_lag_wr_all$season == max(seasonalData_lag_wr_all$season, na.rm = TRUE))])
retiredWRs <- unique(seasonalData_lag_wr_all$gsis_id[which(seasonalData_lag_wr_all$gsis_id %ni% activeWRs)])
numWRs <- length(unique(seasonalData_lag_wr_all$gsis_id))
wrHoldoutIDs <- sample(retiredWRs, size = ceiling(.2 * numWRs)) # holdout 20% of players
activeTEs <- unique(seasonalData_lag_te_all$gsis_id[which(seasonalData_lag_te_all$season == max(seasonalData_lag_te_all$season, na.rm = TRUE))])
retiredTEs <- unique(seasonalData_lag_te_all$gsis_id[which(seasonalData_lag_te_all$gsis_id %ni% activeTEs)])
numTEs <- length(unique(seasonalData_lag_te_all$gsis_id))
teHoldoutIDs <- sample(retiredTEs, size = ceiling(.2 * numTEs)) # holdout 20% of players
seasonalData_lag_qb_train <- seasonalData_lag_qb_all %>%
filter(gsis_id %ni% qbHoldoutIDs)
seasonalData_lag_qb_test <- seasonalData_lag_qb_all %>%
filter(gsis_id %in% qbHoldoutIDs)
seasonalData_lag_rb_train <- seasonalData_lag_rb_all %>%
filter(gsis_id %ni% rbHoldoutIDs)
seasonalData_lag_rb_test <- seasonalData_lag_rb_all %>%
filter(gsis_id %in% rbHoldoutIDs)
seasonalData_lag_wr_train <- seasonalData_lag_wr_all %>%
filter(gsis_id %ni% wrHoldoutIDs)
seasonalData_lag_wr_test <- seasonalData_lag_wr_all %>%
filter(gsis_id %in% wrHoldoutIDs)
seasonalData_lag_te_train <- seasonalData_lag_te_all %>%
filter(gsis_id %ni% teHoldoutIDs)
seasonalData_lag_te_test <- seasonalData_lag_te_all %>%
filter(gsis_id %in% teHoldoutIDs)
19.4.9 Impute the Missing Data
Here is a vignette demonstrating how to impute missing data using missForest(): https://rpubs.com/lmorgan95/MissForest (archived at: https://perma.cc/6GB4-2E22). To impute data, we use the missRanger package (Mayer, 2024). Below, we impute the training data (and all data) separately by position. We then use the imputation models fitted to the training data to fill in the missing data for the test data. We do not impute the training and test data together, so that they remain separate for the purposes of cross-validation. However, we also impute all data (training and test data together) for purposes of making out-of-sample predictions from the machine learning models to predict players’ performance next season (when actuals are not yet available for evaluating their accuracy).
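Before the full imputation, the principle of learning the imputation from the training data and then applying it to the test data can be shown with a deliberately simple stand-in. The sketch below uses mean imputation in base R instead of the random-forest-based imputation used in this chapter; the toy data and the helper `imputeWithValues()` are hypothetical.

```r
# Toy training and test data with missing values (hypothetical values)
trainData <- data.frame(
  age = c(24, 27, NA, 31, 29),
  weight = c(210, NA, 225, 230, 215))

testData <- data.frame(
  age = c(NA, 26),
  weight = c(220, NA))

# "Fit" step: learn the imputation values from the training data only
trainMeans <- colMeans(trainData, na.rm = TRUE)

# "Apply" step: fill in missing values using the supplied values
imputeWithValues <- function(data, values) {
  for (varName in names(values)) {
    missingRows <- is.na(data[[varName]])
    data[[varName]][missingRows] <- values[[varName]]
  }
  data
}

trainImputed <- imputeWithValues(trainData, trainMeans)
testImputed <- imputeWithValues(testData, trainMeans) # uses training means

testImputed
# The missing test age is filled with the training mean of age (27.75);
# the missing test weight is filled with the training mean of weight (220)
```

Nothing about the test data informs the imputation values, so the test data remain independent of the training process.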
Note: the following code takes a while to run.
Code
Variables to impute: fantasy_points, fantasy_points_ppr, special_teams_tds, passing_epa, pacr, rushing_epa, fantasyPoints_lag, passing_cpoe, rookie_year, draft_number, gs, pass_attempts.pass, throwaways.pass, spikes.pass, drops.pass, bad_throws.pass, times_blitzed.pass, times_hurried.pass, times_hit.pass, times_pressured.pass, batted_balls.pass, on_tgt_throws.pass, rpo_plays.pass, rpo_yards.pass, rpo_pass_att.pass, rpo_pass_yards.pass, rpo_rush_att.pass, rpo_rush_yards.pass, pa_pass_att.pass, pa_pass_yards.pass, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, drop_pct.pass, bad_throw_pct.pass, on_tgt_pct.pass, pressure_pct.pass, ybc_att.rush, yac_att.rush, pocket_time.pass
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, completions, attempts, passing_yards, passing_tds, passing_interceptions, sacks_suffered, sack_yards_lost, sack_fumbles, sack_fumbles_lost, passing_air_yards, passing_yards_after_catch, passing_first_downs, passing_epa, passing_cpoe, passing_2pt_conversions, pacr, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, special_teams_tds, pocket_time.pass, pass_attempts.pass, throwaways.pass, spikes.pass, drops.pass, bad_throws.pass, times_blitzed.pass, times_hurried.pass, times_hit.pass, times_pressured.pass, batted_balls.pass, on_tgt_throws.pass, rpo_plays.pass, rpo_yards.pass, rpo_pass_att.pass, rpo_pass_yards.pass, rpo_rush_att.pass, rpo_rush_yards.pass, pa_pass_att.pass, pa_pass_yards.pass, drop_pct.pass, bad_throw_pct.pass, on_tgt_pct.pass, pressure_pct.pass, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush
fntsy_ fnts__ spcl__ pssng_p pacr rshng_ fntsP_ pssng_c rok_yr drft_n gs pss_t. thrww. spks.p drps.p bd_th. tms_b. tms_hr. tms_ht. tms_p. bttd_. on_tgt_t. rp_pl. rp_yr. rp_pss_t. rp_pss_y. rp_rsh_t. rp_rsh_y. p_pss_t. p_pss_y. att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_t. att_b. drp_p. bd_t_. on_tgt_p. prss_. ybc_t. yc_tt. pckt_.
iter 1: 0.0054 0.0024 0.7924 0.1919 0.7612 0.3628 0.4789 0.4133 0.0224 0.5216 0.0271 0.0134 0.3024 0.7659 0.1304 0.0541 0.0758 0.1759 0.1820 0.0370 0.3238 0.0291 0.2952 0.1812 0.0885 0.0867 0.2627 0.2563 0.1093 0.0902 0.0580 0.0645 0.1732 0.0524 0.0578 0.1795 0.3524 0.3428 0.7447 0.5158 0.0824 0.6803 0.3529 0.5758 0.8111
iter 2: 0.0044 0.0048 0.8304 0.2002 0.7926 0.3736 0.4801 0.4289 0.0488 0.6139 0.0188 0.0090 0.2883 0.7481 0.0764 0.0385 0.0718 0.1231 0.1329 0.0337 0.2760 0.0113 0.0548 0.0814 0.0765 0.0990 0.1989 0.2841 0.0707 0.0952 0.0396 0.0386 0.1606 0.0492 0.0525 0.1220 0.2541 0.3556 0.7468 0.4937 0.0827 0.6610 0.3465 0.5796 0.8134
iter 3: 0.0049 0.0046 0.8690 0.1986 0.7810 0.3641 0.4774 0.4360 0.0528 0.6123 0.0188 0.0088 0.2867 0.7538 0.0767 0.0393 0.0734 0.1261 0.1374 0.0343 0.2741 0.0119 0.0524 0.0816 0.0748 0.1008 0.2184 0.2811 0.0691 0.0926 0.0389 0.0413 0.1640 0.0511 0.0585 0.1255 0.2510 0.3609 0.7477 0.5108 0.0858 0.6426 0.3588 0.5734 0.8300
missRanger object. Extract imputed data via $data
- best iteration: 2
- best average OOB imputation error: 0.2524825
Code
data_all_qb <- seasonalData_lag_qb_all_imp$data
data_all_qb_matrix <- data_all_qb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
newData_qb <- data_all_qb %>%
filter(season == max(season, na.rm = TRUE)) %>%
select(-fantasyPoints_lag)
newData_qb_matrix <- data_all_qb_matrix[
data_all_qb_matrix[, "season"] == max(data_all_qb_matrix[, "season"], na.rm = TRUE), # keep only rows with the most recent season
, # all columns
drop = FALSE]
dropCol_qb <- which(colnames(newData_qb_matrix) == "fantasyPoints_lag")
newData_qb_matrix <- newData_qb_matrix[, -dropCol_qb, drop = FALSE]
seasonalData_lag_qb_train_imp <- missRanger::missRanger(
seasonalData_lag_qb_train,
pmm.k = 5,
verbose = 2,
seed = 52242,
keep_forests = TRUE)
Variables to impute: fantasy_points, fantasy_points_ppr, special_teams_tds, passing_epa, pacr, rushing_epa, fantasyPoints_lag, passing_cpoe, rookie_year, draft_number, gs, pass_attempts.pass, throwaways.pass, spikes.pass, drops.pass, bad_throws.pass, times_blitzed.pass, times_hurried.pass, times_hit.pass, times_pressured.pass, batted_balls.pass, on_tgt_throws.pass, rpo_plays.pass, rpo_yards.pass, rpo_pass_att.pass, rpo_pass_yards.pass, rpo_rush_att.pass, rpo_rush_yards.pass, pa_pass_att.pass, pa_pass_yards.pass, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, drop_pct.pass, bad_throw_pct.pass, on_tgt_pct.pass, pressure_pct.pass, ybc_att.rush, yac_att.rush, pocket_time.pass
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, completions, attempts, passing_yards, passing_tds, passing_interceptions, sacks_suffered, sack_yards_lost, sack_fumbles, sack_fumbles_lost, passing_air_yards, passing_yards_after_catch, passing_first_downs, passing_epa, passing_cpoe, passing_2pt_conversions, pacr, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, special_teams_tds, pocket_time.pass, pass_attempts.pass, throwaways.pass, spikes.pass, drops.pass, bad_throws.pass, times_blitzed.pass, times_hurried.pass, times_hit.pass, times_pressured.pass, batted_balls.pass, on_tgt_throws.pass, rpo_plays.pass, rpo_yards.pass, rpo_pass_att.pass, rpo_pass_yards.pass, rpo_rush_att.pass, rpo_rush_yards.pass, pa_pass_att.pass, pa_pass_yards.pass, drop_pct.pass, bad_throw_pct.pass, on_tgt_pct.pass, pressure_pct.pass, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush
fntsy_ fnts__ spcl__ pssng_p pacr rshng_ fntsP_ pssng_c rok_yr drft_n gs pss_t. thrww. spks.p drps.p bd_th. tms_b. tms_hr. tms_ht. tms_p. bttd_. on_tgt_t. rp_pl. rp_yr. rp_pss_t. rp_pss_y. rp_rsh_t. rp_rsh_y. p_pss_t. p_pss_y. att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_t. att_b. drp_p. bd_t_. on_tgt_p. prss_. ybc_t. yc_tt. pckt_.
iter 1: 0.0061 0.0028 0.8162 0.1897 0.5083 0.3633 0.4726 0.4456 0.0242 0.4723 0.0283 0.0141 0.2939 0.7728 0.1343 0.0558 0.0744 0.1757 0.1818 0.0381 0.3288 0.0351 0.2921 0.1846 0.0860 0.0894 0.2737 0.2661 0.1127 0.0900 0.0586 0.0644 0.1800 0.0574 0.0639 0.1792 0.3570 0.3486 0.7646 0.5313 0.0868 0.7084 0.3533 0.5933 0.8466
iter 2: 0.0052 0.0052 0.8304 0.1937 0.5621 0.3715 0.4614 0.4586 0.0505 0.5647 0.0192 0.0092 0.2953 0.7530 0.0800 0.0393 0.0725 0.1170 0.1355 0.0343 0.2771 0.0121 0.0555 0.0731 0.0713 0.0979 0.2073 0.2943 0.0698 0.0911 0.0416 0.0399 0.1683 0.0527 0.0577 0.1262 0.2474 0.3582 0.7719 0.5165 0.0900 0.6862 0.3642 0.5926 0.8400
iter 3: 0.0053 0.0051 0.8261 0.2008 0.5551 0.3571 0.4727 0.4410 0.0551 0.5658 0.0188 0.0092 0.2859 0.7460 0.0807 0.0402 0.0739 0.1202 0.1393 0.0351 0.2808 0.0114 0.0595 0.0705 0.0775 0.1051 0.2163 0.2935 0.0718 0.0921 0.0426 0.0400 0.1719 0.0535 0.0534 0.1225 0.2498 0.3484 0.7502 0.5100 0.0884 0.6609 0.3672 0.5852 0.8440
iter 4: 0.0054 0.0051 0.6928 0.1979 0.5598 0.3732 0.4771 0.4349 0.0506 0.5691 0.0189 0.0085 0.2891 0.7456 0.0785 0.0395 0.0737 0.1210 0.1353 0.0335 0.2836 0.0117 0.0566 0.0778 0.0743 0.1055 0.2131 0.2964 0.0697 0.0912 0.0396 0.0395 0.1611 0.0531 0.0597 0.1258 0.2600 0.3560 0.8062 0.5032 0.0973 0.6739 0.3698 0.5875 0.8485
iter 5: 0.0052 0.0055 0.8355 0.1965 0.5664 0.3710 0.4743 0.4604 0.0520 0.5598 0.0193 0.0091 0.2852 0.7474 0.0800 0.0405 0.0722 0.1213 0.1366 0.0344 0.2788 0.0118 0.0555 0.0756 0.0746 0.0986 0.2190 0.2765 0.0695 0.0932 0.0390 0.0425 0.1650 0.0509 0.0576 0.1305 0.2556 0.3509 0.7738 0.5051 0.0969 0.6902 0.3640 0.6007 0.8326
missRanger object. Extract imputed data via $data
- best iteration: 4
- best average OOB imputation error: 0.2482278
Code
data_train_qb <- seasonalData_lag_qb_train_imp$data
data_train_qb_matrix <- data_train_qb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
seasonalData_lag_qb_test_imp <- predict(
object = seasonalData_lag_qb_train_imp,
newdata = seasonalData_lag_qb_test,
seed = 52242)
data_test_qb <- seasonalData_lag_qb_test_imp
data_test_qb_matrix <- data_test_qb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
Code
Variables to impute: games, ageCentered20, ageCentered20Quadratic, fantasy_points, fantasy_points_ppr, fantasyPoints, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_2pt_conversions, special_teams_tds, years_of_experience, rushing_epa, air_yards_share, receiving_epa, racr, target_share, wopr, fantasyPoints_lag, rookie_year, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, ybc_att.rush, yac_att.rush, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
games agCn20 agC20Q fntsy_ fnts__ fntsyP carris rshng_y rshng_t rshng_f rshng_fm_ rshng_fr_ rsh_2_ rcptns targts rcvng_y rcvng_t rcvng_f rcvng_fm_ rcvng_r_ rcv___ rcvng_fr_ rcv_2_ spcl__ yrs_f_ rshng_p ar_yr_ rcvng_p racr trgt_s wopr fntsP_ rok_yr drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc ybc_t. yc_tt. adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r
iter 1: 0.8865 0.0057 0.0031 0.4544 0.0178 0.0032 0.0745 0.0233 0.1462 0.4895 0.2594 0.0295 0.9849 0.0690 0.0666 0.0534 0.4327 0.8626 0.4824 0.6841 0.0322 0.0614 1.0171 0.8263 0.1817 0.4512 0.3321 0.3894 0.5211 0.4549 0.1817 0.5440 0.0197 0.5999 0.1700 0.0244 0.0222 0.0792 0.0297 0.0527 0.0520 0.2134 0.3431 0.0252 0.0180 0.0257 0.1634 0.0437 0.3108 0.0217 0.3925 0.4610 0.6941 0.4880 0.5402 0.2670 0.2026 0.3482 0.1596 0.2698 0.3637
iter 2: 0.2755 0.0162 0.0207 0.0063 0.0037 0.0044 0.0161 0.0148 0.0912 0.2524 0.2898 0.0248 0.9832 0.0273 0.0444 0.0233 0.2065 0.4600 0.4891 0.1285 0.0332 0.0457 1.0175 0.8566 0.1824 0.4212 0.2329 0.3099 0.5605 0.2569 0.1742 0.5373 0.0424 0.6377 0.1653 0.0167 0.0123 0.0859 0.0302 0.0367 0.0349 0.1030 0.3689 0.0144 0.0159 0.0195 0.1403 0.0434 0.1549 0.0190 0.3840 0.1050 0.5453 0.4882 0.5616 0.2472 0.1953 0.1525 0.1687 0.2595 0.3619
iter 3: 0.2744 0.0163 0.0231 0.0062 0.0038 0.0047 0.0152 0.0137 0.0980 0.2601 0.2906 0.0244 0.9800 0.0265 0.0347 0.0231 0.2101 0.4638 0.4954 0.1284 0.0283 0.0458 1.0114 0.8731 0.1818 0.4117 0.2278 0.3037 0.5699 0.2052 0.1800 0.5389 0.0400 0.6423 0.1624 0.0166 0.0124 0.0893 0.0306 0.0374 0.0356 0.1074 0.3628 0.0144 0.0163 0.0187 0.1390 0.0463 0.1583 0.0190 0.3882 0.1062 0.5642 0.4796 0.5570 0.2380 0.1935 0.1586 0.1588 0.2625 0.3648
iter 4: 0.2776 0.0169 0.0220 0.0063 0.0038 0.0045 0.0151 0.0138 0.0979 0.2584 0.2846 0.0243 0.9782 0.0263 0.0281 0.0221 0.1968 0.4594 0.4817 0.1267 0.0290 0.0462 1.0104 0.8614 0.1854 0.4216 0.2333 0.3004 0.5467 0.1917 0.1815 0.5353 0.0443 0.6503 0.1657 0.0166 0.0121 0.0905 0.0313 0.0378 0.0357 0.1041 0.3437 0.0155 0.0159 0.0185 0.1405 0.0441 0.1613 0.0196 0.3816 0.1117 0.5682 0.5011 0.5585 0.2421 0.1975 0.1520 0.1770 0.2650 0.3647
iter 5: 0.2752 0.0163 0.0226 0.0063 0.0038 0.0045 0.0158 0.0138 0.1015 0.2614 0.2857 0.0242 0.9740 0.0250 0.0303 0.0218 0.2004 0.4607 0.4810 0.1167 0.0285 0.0449 1.0077 0.8658 0.1835 0.4182 0.2170 0.2995 0.5690 0.2010 0.1794 0.5375 0.0385 0.6487 0.1652 0.0166 0.0124 0.0878 0.0306 0.0368 0.0353 0.1069 0.3539 0.0154 0.0159 0.0193 0.1409 0.0447 0.1598 0.0205 0.3873 0.1062 0.5583 0.4895 0.5501 0.2418 0.1979 0.1713 0.1726 0.2625 0.3596
iter 6: 0.2760 0.0158 0.0223 0.0063 0.0037 0.0046 0.0150 0.0144 0.0982 0.2568 0.2816 0.0238 0.9810 0.0253 0.0273 0.0223 0.2141 0.4606 0.4881 0.1386 0.0300 0.0457 1.0174 0.8605 0.1821 0.4188 0.2263 0.2985 0.5497 0.1779 0.1536 0.5388 0.0389 0.6422 0.1668 0.0162 0.0119 0.0897 0.0305 0.0376 0.0356 0.1066 0.3529 0.0149 0.0159 0.0196 0.1446 0.0450 0.1585 0.0197 0.3857 0.1001 0.5607 0.4948 0.5478 0.2487 0.1945 0.1438 0.1543 0.2568 0.3612
iter 7: 0.2748 0.0158 0.0212 0.0064 0.0039 0.0047 0.0149 0.0141 0.0986 0.2611 0.2877 0.0241 0.9755 0.0253 0.0310 0.0223 0.2163 0.4553 0.4885 0.1335 0.0293 0.0456 1.0096 0.8575 0.1821 0.4236 0.2203 0.2998 0.5510 0.2107 0.1797 0.5354 0.0416 0.6395 0.1646 0.0166 0.0117 0.0895 0.0310 0.0371 0.0361 0.1073 0.3547 0.0154 0.0156 0.0193 0.1410 0.0449 0.1628 0.0201 0.3892 0.1076 0.5609 0.4946 0.5643 0.2392 0.1899 0.1540 0.1423 0.2667 0.3603
missRanger object. Extract imputed data via $data
- best iteration: 6
- best average OOB imputation error: 0.2175486
Code
data_all_rb <- seasonalData_lag_rb_all_imp$data
data_all_rb_matrix <- data_all_rb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
newData_rb <- data_all_rb %>%
filter(season == max(season, na.rm = TRUE)) %>%
select(-fantasyPoints_lag)
newData_rb_matrix <- data_all_rb_matrix[
data_all_rb_matrix[, "season"] == max(data_all_rb_matrix[, "season"], na.rm = TRUE), # keep only rows with the most recent season
, # all columns
drop = FALSE]
dropCol_rb <- which(colnames(newData_rb_matrix) == "fantasyPoints_lag")
newData_rb_matrix <- newData_rb_matrix[, -dropCol_rb, drop = FALSE]
seasonalData_lag_rb_train_imp <- missRanger::missRanger(
seasonalData_lag_rb_train,
pmm.k = 5,
verbose = 2,
seed = 52242,
keep_forests = TRUE)
Variables to impute: games, ageCentered20, ageCentered20Quadratic, fantasy_points, fantasy_points_ppr, fantasyPoints, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_2pt_conversions, special_teams_tds, years_of_experience, rushing_epa, air_yards_share, receiving_epa, racr, target_share, wopr, fantasyPoints_lag, rookie_year, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, ybc_att.rush, yac_att.rush, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
games agCn20 agC20Q fntsy_ fnts__ fntsyP carris rshng_y rshng_t rshng_f rshng_fm_ rshng_fr_ rsh_2_ rcptns targts rcvng_y rcvng_t rcvng_f rcvng_fm_ rcvng_r_ rcv___ rcvng_fr_ rcv_2_ spcl__ yrs_f_ rshng_p ar_yr_ rcvng_p racr trgt_s wopr fntsP_ rok_yr drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc ybc_t. yc_tt. adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r
iter 1: 0.8759 0.0072 0.0036 0.4578 0.0178 0.0035 0.0736 0.0229 0.1524 0.4776 0.2679 0.0288 0.9965 0.0749 0.0744 0.0553 0.4578 0.8604 0.4998 0.6821 0.0360 0.0639 1.0042 0.8380 0.1806 0.4662 0.3419 0.3961 0.5595 0.4715 0.1968 0.5338 0.0246 0.5882 0.1726 0.0265 0.0235 0.0849 0.0311 0.0550 0.0521 0.2131 0.3689 0.0281 0.0197 0.0286 0.1742 0.0463 0.3114 0.0229 0.3942 0.4806 0.7239 0.5199 0.5631 0.2865 0.2195 0.3630 0.2052 0.2596 0.4091
iter 2: 0.2745 0.0177 0.0266 0.0067 0.0041 0.0049 0.0169 0.0154 0.1017 0.2590 0.2956 0.0237 0.9814 0.0286 0.0522 0.0240 0.2187 0.4582 0.4919 0.1541 0.0362 0.0475 1.0075 0.8811 0.1822 0.4481 0.2377 0.3184 0.6116 0.2628 0.2007 0.5254 0.0473 0.6411 0.1653 0.0179 0.0132 0.0940 0.0325 0.0392 0.0375 0.1057 0.3678 0.0161 0.0169 0.0196 0.1510 0.0484 0.1521 0.0202 0.3941 0.1035 0.5567 0.5120 0.5666 0.2466 0.2087 0.1733 0.1699 0.2524 0.4066
iter 3: 0.2766 0.0190 0.0273 0.0067 0.0041 0.0048 0.0159 0.0149 0.0971 0.2615 0.2952 0.0245 0.9668 0.0278 0.0409 0.0240 0.2180 0.4648 0.4931 0.1319 0.0350 0.0495 1.0128 0.8907 0.1820 0.4366 0.2459 0.3124 0.6236 0.2555 0.2114 0.5276 0.0438 0.6314 0.1658 0.0175 0.0122 0.0899 0.0319 0.0386 0.0386 0.1100 0.3783 0.0155 0.0165 0.0194 0.1477 0.0474 0.1499 0.0194 0.3929 0.1121 0.5761 0.5245 0.5651 0.2490 0.2103 0.1767 0.1817 0.2658 0.4106
missRanger object. Extract imputed data via $data
- best iteration: 2
- best average OOB imputation error: 0.226086
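Setting `keep_forests = TRUE` stores the fitted imputation models so that they can be applied to new data with `predict()` instead of being re-estimated on the test set. The general pattern (estimate the imputation on the training data only, then apply it unchanged to the test data) can be sketched in base R. The sketch below uses simple mean imputation as a stand-in; it is not missRanger's algorithm, and the data and the `impute_with()` helper are hypothetical.

```r
# Hypothetical toy data; mean imputation is a stand-in, not missRanger's algorithm
train <- data.frame(carries = c(100, NA, 200, 150), targets = c(30, 40, NA, 50))
test  <- data.frame(carries = c(NA, 120), targets = c(20, NA))

# "Fit" the imputation on the training data only
train_means <- colMeans(train, na.rm = TRUE)

# Hypothetical helper: fill missing values using previously stored means
impute_with <- function(data, means) {
  for (col in names(means)) {
    data[[col]][is.na(data[[col]])] <- means[[col]]
  }
  data
}

train_imp <- impute_with(train, train_means)
test_imp  <- impute_with(test, train_means) # reuses the training means
```

Because the test rows never contribute to the imputation model, the test-set accuracy is not contaminated by information from the test data.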
Code
data_train_rb <- seasonalData_lag_rb_train_imp$data
data_train_rb_matrix <- data_train_rb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
seasonalData_lag_rb_test_imp <- predict(
object = seasonalData_lag_rb_train_imp,
newdata = seasonalData_lag_rb_test,
seed = 52242)
data_test_rb <- seasonalData_lag_rb_test_imp
data_test_rb_matrix <- data_test_rb %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
Code
Variables to impute: fantasy_points, fantasy_points_ppr, special_teams_tds, years_of_experience, receiving_epa, racr, air_yards_share, target_share, wopr, fantasyPoints_lag, rookie_year, rushing_epa, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec, ybc_att.rush, yac_att.rush
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
fntsy_ fnts__ spcl__ yrs_f_ rcvng_ racr ar_yr_ trgt_s wopr fntsP_ rok_yr rshng_ drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r ybc_t. yc_tt.
iter 1: 0.0061 0.0010 0.7104 0.1566 0.1040 0.8131 0.1013 0.1722 0.0402 0.4890 0.0150 0.3811 0.6654 0.1459 0.1184 0.0898 0.2353 0.1234 0.0966 0.2670 0.6383 0.3084 0.0198 0.0136 0.0151 0.0671 0.0135 0.0268 0.0442 0.4465 0.4320 0.4674 0.2961 0.1410 0.3819 0.1840 0.2251 0.3929 0.2568 0.4760
iter 2: 0.0058 0.0019 0.7826 0.1601 0.0835 0.7518 0.0607 0.0930 0.0452 0.4939 0.0296 0.3301 0.6843 0.1440 0.0851 0.0600 0.2638 0.1161 0.0708 0.1804 0.3103 0.3223 0.0109 0.0108 0.0096 0.0719 0.0139 0.0200 0.0318 0.4476 0.0778 0.3692 0.2401 0.1448 0.1629 0.1601 0.2261 0.3793 0.2536 0.4775
iter 3: 0.0061 0.0019 0.7857 0.1593 0.0829 0.7421 0.0580 0.0986 0.0481 0.4946 0.0318 0.3334 0.6890 0.1430 0.0823 0.0604 0.2595 0.1177 0.0728 0.1802 0.3077 0.3194 0.0109 0.0114 0.0095 0.0724 0.0133 0.0199 0.0312 0.4411 0.0767 0.3687 0.2369 0.1455 0.1530 0.1660 0.2169 0.3878 0.2466 0.4716
iter 4: 0.0060 0.0018 0.7874 0.1604 0.0832 0.7394 0.0591 0.0940 0.0479 0.4926 0.0301 0.3317 0.6896 0.1434 0.0863 0.0601 0.2562 0.1227 0.0711 0.1900 0.3089 0.3194 0.0105 0.0112 0.0095 0.0707 0.0140 0.0202 0.0318 0.4447 0.0784 0.3674 0.2339 0.1423 0.1592 0.1700 0.2254 0.3886 0.2552 0.4662
missRanger object. Extract imputed data via $data
- best iteration: 3
- best average OOB imputation error: 0.203846
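The matrix conversion and column-dropping pattern used for each position in this section can be illustrated on a toy data frame. This is a minimal base-R sketch with hypothetical values; it mirrors the logic of the book's dplyr code rather than reproducing it. Note that `as.integer()` applied to a factor returns the level codes (assigned in sorted order of the levels), not the original labels.

```r
# Toy data frame (hypothetical values, not from the book's data)
toy <- data.frame(
  position = factor(c("RB", "WR", "RB", "TE")),
  season = c(2021, 2022, 2023, 2023),
  fantasyPoints_lag = c(120.5, 88.2, NA, 47.0)
)

# Base-R equivalent of mutate(across(where(is.factor), ...)):
# replace each factor with its integer level code, stored as double
is_fac <- vapply(toy, is.factor, logical(1))
toy[is_fac] <- lapply(toy[is_fac], function(x) as.numeric(as.integer(x)))

toy_matrix <- as.matrix(toy) # all columns numeric, so the matrix is numeric

# Keep rows from the most recent season, then drop the outcome column
latest_rows <- toy_matrix[, "season"] == max(toy_matrix[, "season"], na.rm = TRUE)
latest <- toy_matrix[latest_rows, , drop = FALSE]
newData_toy <- latest[, colnames(latest) != "fantasyPoints_lag", drop = FALSE]
```

Dropping the column with a logical test on `colnames()` is a design choice in this sketch: negative indexing with `-which(...)` would select zero columns if the name were ever absent, whereas the logical test leaves the matrix unchanged in that case.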
Code
data_all_wr <- seasonalData_lag_wr_all_imp$data
data_all_wr_matrix <- data_all_wr %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
newData_wr <- data_all_wr %>%
filter(season == max(season, na.rm = TRUE)) %>%
select(-fantasyPoints_lag)
newData_wr_matrix <- data_all_wr_matrix[
data_all_wr_matrix[, "season"] == max(data_all_wr_matrix[, "season"], na.rm = TRUE), # keep only rows with the most recent season
, # all columns
drop = FALSE]
dropCol_wr <- which(colnames(newData_wr_matrix) == "fantasyPoints_lag")
newData_wr_matrix <- newData_wr_matrix[, -dropCol_wr, drop = FALSE]
seasonalData_lag_wr_train_imp <- missRanger::missRanger(
seasonalData_lag_wr_train,
pmm.k = 5,
verbose = 2,
seed = 52242,
keep_forests = TRUE)
Variables to impute: fantasy_points, fantasy_points_ppr, special_teams_tds, years_of_experience, receiving_epa, racr, air_yards_share, target_share, wopr, fantasyPoints_lag, rookie_year, rushing_epa, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec, ybc_att.rush, yac_att.rush
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
fntsy_ fnts__ spcl__ yrs_f_ rcvng_ racr ar_yr_ trgt_s wopr fntsP_ rok_yr rshng_ drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r ybc_t. yc_tt.
iter 1: 0.0064 0.0010 0.7029 0.1611 0.1089 0.8443 0.1021 0.1643 0.0427 0.4935 0.0173 0.3461 0.6788 0.1427 0.1364 0.0993 0.2403 0.1243 0.0979 0.2745 0.6190 0.3171 0.0201 0.0147 0.0159 0.0734 0.0140 0.0280 0.0454 0.4502 0.4439 0.4733 0.3088 0.1641 0.4547 0.2192 0.2439 0.4227 0.2921 0.5068
iter 2: 0.0063 0.0020 0.7835 0.1630 0.0901 0.8044 0.0674 0.0936 0.0479 0.4930 0.0331 0.3235 0.7090 0.1417 0.0896 0.0659 0.2659 0.1273 0.0752 0.1920 0.3068 0.3225 0.0112 0.0116 0.0101 0.0753 0.0141 0.0210 0.0333 0.4431 0.0809 0.3676 0.2571 0.1617 0.1735 0.1797 0.2441 0.3996 0.2911 0.4923
iter 3: 0.0063 0.0020 0.7710 0.1639 0.0881 0.7954 0.0646 0.0982 0.0515 0.4956 0.0338 0.3200 0.7088 0.1413 0.0900 0.0640 0.2565 0.1250 0.0735 0.1989 0.3082 0.3280 0.0114 0.0119 0.0096 0.0763 0.0141 0.0216 0.0326 0.4388 0.0807 0.3703 0.2582 0.1623 0.1657 0.2018 0.2375 0.4016 0.2838 0.4794
iter 4: 0.0062 0.0020 0.7792 0.1625 0.0877 0.8043 0.0632 0.0919 0.0477 0.4963 0.0341 0.3239 0.7038 0.1420 0.0950 0.0653 0.2664 0.1309 0.0767 0.2025 0.2933 0.3076 0.0109 0.0119 0.0097 0.0745 0.0143 0.0217 0.0326 0.4432 0.0804 0.3688 0.2584 0.1605 0.1931 0.2013 0.2378 0.4119 0.2824 0.4860
missRanger object. Extract imputed data via $data
- best iteration: 3
- best average OOB imputation error: 0.2110456
Code
data_train_wr <- seasonalData_lag_wr_train_imp$data
data_train_wr_matrix <- data_train_wr %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
seasonalData_lag_wr_test_imp <- predict(
object = seasonalData_lag_wr_train_imp,
newdata = seasonalData_lag_wr_test,
seed = 52242)
data_test_wr <- seasonalData_lag_wr_test_imp
data_test_wr_matrix <- data_test_wr %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
Code
Variables to impute: games, ageCentered20, ageCentered20Quadratic, fantasy_points, fantasy_points_ppr, fantasyPoints, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_2pt_conversions, special_teams_tds, years_of_experience, receiving_epa, racr, air_yards_share, target_share, wopr, fantasyPoints_lag, rookie_year, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec, rushing_epa, ybc_att.rush, yac_att.rush
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
games agCn20 agC20Q fntsy_ fnts__ fntsyP carris rshng_y rshng_t rshng_f rshng_fm_ rshng_fr_ rsh_2_ rcptns targts rcvng_y rcvng_t rcvng_f rcvng_fm_ rcvng_r_ rcv___ rcvng_fr_ rcv_2_ spcl__ yrs_f_ rcvng_p racr ar_yr_ trgt_s wopr fntsP_ rok_yr drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r rshng_p ybc_t. yc_tt.
iter 1: 0.8157 0.0061 0.0030 0.3406 0.0194 0.0039 0.5253 0.2259 0.2452 0.7083 0.6874 0.0802 1.1303 0.0281 0.0558 0.0255 0.0845 0.8134 0.4317 0.0655 0.0784 0.0253 0.9716 1.0271 0.1530 0.1689 0.6899 0.1092 0.4432 0.1004 0.4764 0.0180 0.6054 0.3846 0.0762 0.0832 0.1564 0.0618 0.0704 0.2123 0.3921 0.6733 0.0290 0.0207 0.0226 0.1012 0.0212 0.0420 0.0603 0.4332 0.4640 0.4996 0.2804 0.1667 0.3542 0.1652 0.2843 0.3948 0.3270 0.6620 0.7439
iter 2: 0.1712 0.0175 0.0256 0.0106 0.0037 0.0055 0.1140 0.1113 0.0990 0.5369 0.7422 0.0852 1.1286 0.0193 0.0200 0.0128 0.0862 0.4248 0.4659 0.0206 0.0529 0.0217 0.9711 1.0114 0.1561 0.1397 0.6715 0.0766 0.1819 0.1085 0.4649 0.0366 0.6346 0.3880 0.0722 0.0728 0.1592 0.0680 0.0759 0.2034 0.3651 0.6811 0.0164 0.0158 0.0161 0.1080 0.0211 0.0327 0.0475 0.4342 0.1149 0.4173 0.2589 0.1742 0.1467 0.1531 0.2941 0.3851 0.3357 0.6846 0.7397
iter 3: 0.1689 0.0170 0.0261 0.0114 0.0040 0.0056 0.1190 0.1155 0.0978 0.6088 0.7899 0.0945 1.1731 0.0195 0.0203 0.0132 0.0964 0.4270 0.4608 0.0202 0.0525 0.0214 0.9694 1.0265 0.1560 0.1380 0.6453 0.0751 0.1794 0.1204 0.4642 0.0364 0.6369 0.3853 0.0779 0.0786 0.1497 0.0569 0.0932 0.2027 0.4003 0.6633 0.0171 0.0164 0.0167 0.1049 0.0220 0.0335 0.0466 0.4371 0.1141 0.4304 0.2665 0.1775 0.1464 0.1537 0.2916 0.3720 0.3137 0.6640 0.7770
missRanger object. Extract imputed data via $data
- best iteration: 2
- best average OOB imputation error: 0.2477098
Code
data_all_te <- seasonalData_lag_te_all_imp$data
data_all_te_matrix <- data_all_te %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
newData_te <- data_all_te %>%
filter(season == max(season, na.rm = TRUE)) %>%
select(-fantasyPoints_lag)
newData_te_matrix <- data_all_te_matrix[
data_all_te_matrix[, "season"] == max(data_all_te_matrix[, "season"], na.rm = TRUE), # keep only rows with the most recent season
, # all columns
drop = FALSE]
dropCol_te <- which(colnames(newData_te_matrix) == "fantasyPoints_lag")
newData_te_matrix <- newData_te_matrix[, -dropCol_te, drop = FALSE]
seasonalData_lag_te_train_imp <- missRanger::missRanger(
seasonalData_lag_te_train,
pmm.k = 5,
verbose = 2,
seed = 52242,
keep_forests = TRUE)
Variables to impute: games, years_of_experience, ageCentered20, ageCentered20Quadratic, fantasy_points, fantasy_points_ppr, fantasyPoints, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_2pt_conversions, special_teams_tds, receiving_epa, racr, air_yards_share, target_share, wopr, fantasyPoints_lag, rookie_year, draft_number, gs, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, adot.rec, rat.rec, drop_percent.rec, rec_br.rec, ybc_r.rec, yac_r.rec, rushing_epa, ybc_att.rush, yac_att.rush
Variables used to impute: gsis_id, season, games, gs, years_of_experience, age, ageCentered20, ageCentered20Quadratic, height, weight, rookie_year, draft_number, fantasy_points, fantasy_points_ppr, fantasyPoints, fantasyPoints_lag, carries, rushing_yards, rushing_tds, rushing_fumbles, rushing_fumbles_lost, rushing_first_downs, rushing_epa, rushing_2pt_conversions, receptions, targets, receiving_yards, receiving_tds, receiving_fumbles, receiving_fumbles_lost, receiving_air_yards, receiving_yards_after_catch, receiving_first_downs, receiving_epa, receiving_2pt_conversions, racr, target_share, air_yards_share, wopr, special_teams_tds, ybc_att.rush, yac_att.rush, att.rush, yds.rush, td.rush, x1d.rush, ybc.rush, yac.rush, brk_tkl.rush, att_br.rush, ybc_r.rec, yac_r.rec, adot.rec, rat.rec, tgt.rec, rec.rec, yds.rec, td.rec, x1d.rec, ybc.rec, yac.rec, brk_tkl.rec, drop.rec, int.rec, drop_percent.rec, rec_br.rec
games yrs_f_ agCn20 agC20Q fntsy_ fnts__ fntsyP carris rshng_y rshng_t rshng_f rshng_fm_ rshng_fr_ rsh_2_ rcptns targts rcvng_y rcvng_t rcvng_f rcvng_fm_ rcvng_r_ rcv___ rcvng_fr_ rcv_2_ spcl__ rcvng_p racr ar_yr_ trgt_s wopr fntsP_ rok_yr drft_n gs att.rs yds.rs td.rsh x1d.rs ybc.rs yc.rsh brk_tkl.rs att_b. tgt.rc rec.rc yds.rc td.rec x1d.rc ybc.rc yac.rc brk_tkl.rc drp.rc int.rc adt.rc rat.rc drp_p. rc_br. ybc_r. yc_r.r rshng_p ybc_t. yc_tt.
iter 1: 0.8094 0.1093 0.0070 0.0035 0.3272 0.0235 0.0052 0.2840 0.1426 0.2634 0.8628 0.7885 0.0924 1.1067 0.0298 0.0611 0.0249 0.0969 0.8177 0.4537 0.0650 0.0804 0.0249 0.9680 1.0235 0.1738 0.5438 0.0868 0.4123 0.1172 0.4597 0.0189 0.6057 0.3973 0.0877 0.0886 0.1516 0.0467 0.0593 0.2086 0.4018 0.6464 0.0296 0.0223 0.0237 0.1062 0.0207 0.0428 0.0579 0.4367 0.4700 0.4818 0.3045 0.1724 0.4722 0.2410 0.2693 0.4025 0.3943 0.4791 0.7521
iter 2: 0.1728 0.1469 0.0179 0.0289 0.0104 0.0039 0.0051 0.0863 0.0763 0.1528 0.7880 0.9474 0.0849 1.0234 0.0193 0.0198 0.0137 0.0915 0.4327 0.4835 0.0238 0.0558 0.0221 0.9617 1.0376 0.1464 0.5141 0.0630 0.1827 0.1062 0.4562 0.0379 0.6361 0.3908 0.0641 0.0767 0.1425 0.0603 0.0747 0.1970 0.3903 0.6647 0.0182 0.0171 0.0165 0.1074 0.0226 0.0332 0.0503 0.4361 0.1170 0.4096 0.2673 0.1862 0.2255 0.2386 0.2793 0.4004 0.3648 0.5070 0.7621
iter 3: 0.1713 0.1447 0.0195 0.0276 0.0104 0.0036 0.0051 0.0796 0.0889 0.1611 0.8505 0.9348 0.0904 1.0447 0.0196 0.0205 0.0134 0.0901 0.4465 0.4867 0.0233 0.0569 0.0222 0.9519 1.0103 0.1457 0.5062 0.0617 0.1698 0.1115 0.4530 0.0382 0.6521 0.3899 0.0665 0.0681 0.1457 0.0647 0.0866 0.2055 0.3919 0.6791 0.0169 0.0169 0.0168 0.1107 0.0213 0.0339 0.0500 0.4315 0.1200 0.4148 0.2745 0.1822 0.1947 0.2205 0.2778 0.4032 0.3639 0.4933 0.7741
missRanger object. Extract imputed data via $data
- best iteration: 2
- best average OOB imputation error: 0.2519559
Code
data_train_te <- seasonalData_lag_te_train_imp$data
data_train_te_matrix <- data_train_te %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
seasonalData_lag_te_test_imp <- predict(
object = seasonalData_lag_te_train_imp,
newdata = seasonalData_lag_te_test,
seed = 52242)
data_test_te <- seasonalData_lag_te_test_imp
data_test_te_matrix <- data_test_te %>%
mutate(across(where(is.factor), ~ as.numeric(as.integer(.)))) %>%
as.matrix()
19.5 Identify Cores for Parallel Processing
19.6 Fitting the Traditional Regression Models
19.6.1 Regression with One Predictor
19.6.2 Regression with Multiple Predictors
19.7 Fitting the Machine Learning Models
19.7.1 Least Absolute Shrinkage and Selection Operator (LASSO)
19.7.2 Ridge Regression
19.7.3 Elastic Net
19.7.4 Random Forest Machine Learning
19.7.4.1 Cross-Sectional Data
We fit the random forest models with the caret package (Kuhn, 2024). To speed up model fitting, we use the parallel and doParallel (Corporation & Weston, 2022) packages for parallel processing.
Code
cl <- parallel::makeCluster(num_cores)
doParallel::registerDoParallel(cl)
set.seed(52242)
randomForest_qb <- caret::train(
  fantasyPoints_lag ~ ., # use all predictors
  data = seasonalData_lag_subsetQB_imp$ximp,
  method = "rf",
  trControl = caret::trainControl(
    method = "cv",
    number = 10)) # 10-fold cross-validation
Error in eval(expr, p): object 'seasonalData_lag_subsetQB_imp' not found
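What 10-fold cross-validation does can be sketched in base R. The example below is a minimal illustration on simulated data with a linear model standing in for the random forest; the data, variable names, and model are assumptions for illustration and are not the book's analysis.

```r
set.seed(52242)

# Simulated data (hypothetical; stands in for the player-season data)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 50 + 10 * sim$x1 - 5 * sim$x2 + rnorm(n, sd = 5)

# Randomly assign each row to one of k folds of equal size
k <- 10
fold <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, predict the held-out fold
rmse_by_fold <- vapply(seq_len(k), function(i) {
  fit <- lm(y ~ x1 + x2, data = sim[fold != i, ])
  pred <- predict(fit, newdata = sim[fold == i, ])
  sqrt(mean((sim$y[fold == i] - pred)^2))
}, numeric(1))

# Cross-validated estimate of out-of-sample error
cv_rmse <- mean(rmse_by_fold)
```

Every observation is used for testing exactly once and for training k - 1 times, so the averaged error is less dependent on any single split than a one-time hold-out.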
19.7.4.2 Longitudinal Data
Code
library("LongituRF")
smerf <- LongituRF::MERF(
X = seasonalData_subsetQB_imp$ximp[,c("passing_epa")] %>% as.matrix(),
Y = seasonalData_subsetQB$fantasyPoints_lag,
Z = seasonalData_subsetQB_imp$ximp[,c("pacr")] %>% as.matrix(),
id = seasonalData_subsetQB$gsis_id,
time = seasonalData_subsetQB_imp$ximp[,c("ageCentered20")] %>% as.matrix(),
ntree = 500,
sto = "BM")
smerf$forest # the fitted random forest (obtained at the last iteration)
smerf$random_effects # the predicted random effects for each player
smerf$omega # the predicted stochastic processes
plot(smerf$Vraisemblance) # evolution of the log-likelihood
smerf$OOB # OOB error at each iteration
19.7.5 k-Fold Cross-Validation
19.7.6 Leave-One-Out (LOO) Cross-Validation
19.7.7 Combining Tree-Boosting with Mixed Models
To combine tree boosting with mixed models, we use the gpboost package.
Adapted from here: https://towardsdatascience.com/mixed-effects-machine-learning-for-longitudinal-panel-data-with-gpboost-part-iii-523bb38effc
19.7.7.1 Process Data
If using a gamma distribution, the outcome variable must contain only positive (nonzero) values.
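Two common ways to obtain a strictly positive outcome are sketched below in base R. The values, the object names, and the constant of 0.01 are assumptions for illustration; this is not necessarily how the book processes the data.

```r
# Hypothetical outcome values, including zero and negative seasons
fantasyPoints_example <- c(156.46, 2.98, 0, -0.10, -4.64, 24.84)

small_constant <- 0.01 # assumed value; choose based on the scale of the outcome

# Option 1: floor the outcome at a small positive constant
y_floored <- pmax(fantasyPoints_example, small_constant)

# Option 2: shift the outcome so that its minimum equals the small constant
shift <- small_constant - min(fantasyPoints_example)
y_shifted <- fantasyPoints_example + shift
```

Flooring changes only the non-positive values, whereas shifting preserves the differences between players but requires subtracting `shift` from the predictions to return to the original scale.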
19.7.7.2 Specify Predictor Variables
19.7.7.3 Specify General Model Options
19.7.7.4 Identify Optimal Tuning Parameters
To identify the optimal tuning parameters for boosting, we partition the training data into inner training data and validation data: we randomly assign 80% of the training data to the inner training set and hold out the remaining 20% for validation. We then use the mean absolute error (MAE) on the held-out validation data as our index of prediction accuracy.
Code
# Partition training data into inner training data and validation data
ntrain_qb <- dim(data_train_qb_matrix)[1]
set.seed(52242)
valid_tune_idx_qb <- sample.int(ntrain_qb, as.integer(0.2*ntrain_qb)) # row indices of the 20% held-out validation data
folds_qb <- list(valid_tune_idx_qb)
# Specify parameter grid, gp_model, and gpb.Dataset
param_grid_qb <- list(
"learning_rate" = c(0.2, 0.1, 0.05, 0.01),
"max_depth" = c(3, 5, 7),
"min_data_in_leaf" = c(10, 50, 100),
"lambda_l2" = c(0, 1, 5))
other_params_qb <- list(
num_leaves = 2^6) # 2^n, where n is smaller than the largest max_depth
gp_model_qb <- gpboost::GPModel(
group_data = data_train_qb_matrix[,"gsis_id"],
likelihood = model_likelihood,
group_rand_coef_data = cbind(
data_train_qb_matrix[,"ageCentered20"],
data_train_qb_matrix[,"ageCentered20Quadratic"]),
ind_effect_group_rand_coef = c(1,1))
gp_data_qb <- gpboost::gpb.Dataset(
data = data_train_qb_matrix[,pred_vars_qb],
categorical_feature = pred_vars_qb_categorical,
label = data_train_qb_matrix[,"fantasyPoints_lag"])
# Find optimal tuning parameters
opt_params_qb <- gpboost::gpb.grid.search.tune.parameters(
param_grid = param_grid_qb,
params = other_params_qb,
num_try_random = NULL,
folds = folds_qb,
data = gp_data_qb,
gp_model = gp_model_qb,
nrounds = nrounds,
early_stopping_rounds = 50,
verbose_eval = 1,
metric = "mae")
opt_params_qb
$best_params
$best_params$learning_rate
[1] 0.2
$best_params$max_depth
[1] 7
$best_params$min_data_in_leaf
[1] 10
$best_params$lambda_l2
[1] 0
$best_iter
[1] 2000
$best_score
[1] 64.71963
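The logic of this tuning step (a single random 80/20 split, with candidates compared by mean absolute error on the held-out rows) can be sketched in base R. The example uses simulated data and two simple linear models as stand-ins for the grid of boosting parameters; all names and values are assumptions for illustration.

```r
set.seed(52242)

# Simulated training data (hypothetical)
n <- 500
sim <- data.frame(x = runif(n, 0, 10))
sim$y <- 20 + 8 * sim$x + rnorm(n, sd = 10)

# Hold out 20% of rows for validation; the rest is the inner training set
valid_idx <- sample.int(n, as.integer(0.2 * n))
inner_train <- sim[-valid_idx, ]
valid <- sim[valid_idx, ]

# Mean absolute error
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Candidate models (stand-ins for combinations of tuning parameters)
candidates <- list(
  intercept_only = lm(y ~ 1, data = inner_train),
  linear = lm(y ~ x, data = inner_train)
)

# Score each candidate on the held-out validation rows
valid_mae <- vapply(
  candidates,
  function(fit) mae(valid$y, predict(fit, newdata = valid)),
  numeric(1)
)

best_candidate <- names(which.min(valid_mae))
```

The candidate with the lowest validation MAE is selected, which parallels how the grid search reports its best parameter combination and score.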
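The grid search selected a learning rate of 0.2, a maximum tree depth (max_depth) of 7, a minimum of 10 samples in any leaf (min_data_in_leaf), and no L2 regularization (lambda_l2 = 0), with a mean absolute error of 64.72 on the held-out validation data at iteration 2,000. A learning rate of 0.2 is the largest value in the grid and is fairly high for boosting. A more conservative alternative that may generalize better is a lower learning rate (0.1), some light regularization (lambda_l2 = 1), shallower trees (max_depth = 3, which captures up to 3-way interactions), a smaller maximum number of terminal nodes per tree (num_leaves = 2^5, or 32), and a larger minimum number of samples in any leaf (min_data_in_leaf = 50). I use the tuned values in the model specification and keep the more conservative values as trailing comments.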
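How max_depth and num_leaves interact can be checked with a little arithmetic: a binary tree of depth d has at most 2^d terminal nodes, so the cap set by num_leaves only matters when it is smaller than 2^max_depth. A minimal sketch, using the depths from the tuning grid and the num_leaves value of 2^6 from the code:

```r
max_depth_grid <- c(3, 5, 7) # depths in the tuning grid
num_leaves <- 2^6            # cap on terminal nodes per tree (64)

# A binary tree of depth d has at most 2^d terminal nodes
max_leaves_by_depth <- 2^max_depth_grid

# The binding limit is whichever of the two constraints is smaller
effective_leaf_cap <- pmin(max_leaves_by_depth, num_leaves)

data.frame(
  max_depth = max_depth_grid,
  max_leaves_by_depth = max_leaves_by_depth,
  effective_leaf_cap = effective_leaf_cap
)
```

With these values, num_leaves is the binding constraint only for max_depth = 7 (128 possible leaves, capped at 64); for depths 3 and 5, the depth limit alone determines the maximum tree size.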
19.7.7.5 Specify Model and Tuning Parameters
Code
gp_model_qb <- gpboost::GPModel(
group_data = data_train_qb_matrix[,"gsis_id"],
likelihood = model_likelihood,
group_rand_coef_data = cbind(
data_train_qb_matrix[,"ageCentered20"],
data_train_qb_matrix[,"ageCentered20Quadratic"]),
ind_effect_group_rand_coef = c(1,1))
gp_data_qb <- gpboost::gpb.Dataset(
data = data_train_qb_matrix[,pred_vars_qb],
categorical_feature = pred_vars_qb_categorical,
label = data_train_qb_matrix[,"fantasyPoints_lag"])
params_qb <- list(
learning_rate = 0.2, # 0.1,
max_depth = 7, #3,
min_data_in_leaf = 10, #50
lambda_l2 = 0, # 1,
num_leaves = 2^6, #2^5,
num_threads = num_cores)
nrounds_qb <- 2000 # identify optimal number of trees through iteration and cross-validation
#gp_model_qb$set_optim_params(params = list(optimizer_cov = "nelder_mead")) # to speed up model estimation
19.7.7.6 Fit Model
Code
[GPBoost] [Info] Total Bins 8709
[GPBoost] [Info] Number of data points in the train set: 1582, number of used features: 73
[GPBoost] [Info] [GPBoost with gaussian likelihood]: initscore=111.481871
[GPBoost] [Info] Start training from score 111.481871
=====================================================
Covariance parameters (random effects):
Param.
Error_term 5157.8499
Group_1 6709.3113
Group_1_rand_coef_nb_1 4.6555
Group_1_rand_coef_nb_2 0.1018
=====================================================
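One way to read these covariance parameters is to compute the share of the random-part variance that is due to stable differences between players. The sketch below assumes that the reported values are variances, that Group_1 is the random intercept for gsis_id, and that the random-slope terms contribute nothing (i.e., at ageCentered20 = 0); it is an interpretive aid under those assumptions, not output from the model.

```r
# Variance components copied from the output above
var_error <- 5157.8499  # Error_term (residual variance)
var_player <- 6709.3113 # Group_1 (random intercept for gsis_id)

# Share of variance attributable to between-player differences
# when the random slopes contribute nothing (ageCentered20 = 0)
icc_at_age20 <- var_player / (var_player + var_error)
round(icc_at_age20, 3) # about 0.565
```

Under these assumptions, a little more than half of the variance not explained by the boosted trees reflects stable player-level differences rather than season-to-season noise.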
19.7.7.7 Evaluate Accuracy of Model on Test Data
Code
# Test Model on Test Data
pred_test_qb <- predict(
gp_model_fit_qb,
data = data_test_qb_matrix[,pred_vars_qb],
group_data_pred = data_test_qb_matrix[,"gsis_id"],
group_rand_coef_data_pred = cbind(
data_test_qb_matrix[,"ageCentered20"],
data_test_qb_matrix[,"ageCentered20Quadratic"]),
predict_var = FALSE,
pred_latent = FALSE)
y_pred_test_qb <- pred_test_qb[["response_mean"]]
cbind(y_pred_test_qb, data_test_qb_matrix[,"fantasyPoints_lag"])
       y_pred_test_qb
[1,] 110.9396 156.46
[2,] 110.9396 130.18
[3,] 110.1854 2.98
[4,] 110.4828 24.84
[5,] 110.4828 40.28
[6,] 110.4828 10.58
[7,] 110.4828 0.68
[8,] 109.7286 25.08
[9,] 109.7286 6.12
[10,] 110.4828 17.00
[11,] 110.4828 44.54
[12,] 110.0504 152.30
[13,] 110.1854 -0.10
[14,] 109.7286 137.96
[15,] 110.9396 154.78
[16,] 118.2726 7.64
[17,] 110.4828 6.10
[18,] 109.7286 228.66
[19,] 113.4060 207.44
[20,] 112.9786 23.80
[21,] 110.4828 263.06
[22,] 112.9786 157.30
[23,] 114.2201 174.48
[24,] 114.3535 228.56
[25,] 112.9786 75.36
[26,] 110.4828 6.72
[27,] 111.2378 75.06
[28,] 110.4828 173.40
[29,] 112.9786 161.88
[30,] 112.9786 81.36
[31,] 109.7286 19.86
[32,] 110.4828 74.86
[33,] 110.4828 48.92
[34,] 110.4828 97.48
[35,] 109.7286 32.84
[36,] 110.4828 -0.40
[37,] 118.7898 121.84
[38,] 110.4828 197.76
[39,] 112.9786 3.16
[40,] 111.3433 104.68
[41,] 110.4828 31.94
[42,] 111.3433 4.06
[43,] 109.7286 7.22
[44,] 113.7814 350.52
[45,] 111.3560 313.92
[46,] 114.2201 259.06
[47,] 118.1769 240.26
[48,] 118.2726 123.14
[49,] 110.4828 48.58
[50,] 109.7286 103.06
[51,] 110.4828 167.20
[52,] 116.6921 175.58
[53,] 111.5761 -0.20
[54,] 109.7286 71.82
[55,] 112.0433 244.56
[56,] 113.4060 156.12
[57,] 114.5369 222.18
[58,] 110.2097 17.78
[59,] 109.0526 1.54
[60,] 110.4828 134.74
[61,] 110.4828 182.74
[62,] 114.1969 177.76
[63,] 112.9786 14.66
[64,] 109.7286 19.84
[65,] 109.7286 150.30
[66,] 109.7286 44.56
[67,] 110.4828 40.26
[68,] 109.7286 86.84
[69,] 110.4828 5.46
[70,] 109.7286 43.82
[71,] 110.0504 303.70
[72,] 119.5381 271.52
[73,] 119.5381 235.56
[74,] 110.0504 230.12
[75,] 117.7914 310.10
[76,] 112.9786 165.78
[77,] 110.9396 201.68
[78,] 111.5761 219.56
[79,] 114.2201 255.66
[80,] 112.9786 248.32
[81,] 113.6054 193.18
[82,] 112.9786 66.94
[83,] 110.4828 44.54
[84,] 110.4828 11.04
[85,] 109.7286 67.98
[86,] 109.7286 6.12
[87,] 110.4828 12.52
[88,] 109.7286 -0.20
[89,] 109.7286 46.68
[90,] 109.7286 -0.10
[91,] 109.7286 9.26
[92,] 110.4828 -0.06
[93,] 109.7286 2.30
[94,] 111.3433 75.88
[95,] 110.9396 157.64
[96,] 112.9786 218.42
[97,] 112.9786 204.38
[98,] 114.2201 178.82
[99,] 111.5761 275.06
[100,] 114.9687 225.24
[101,] 114.2201 118.96
[102,] 114.6264 49.64
[103,] 109.7286 26.64
[104,] 109.7286 -0.10
[105,] 110.4828 18.02
[106,] 110.4983 35.82
[107,] 109.7286 -0.30
[108,] 109.6746 76.02
[109,] 109.7286 5.48
[110,] 110.4828 3.18
[111,] 110.4828 279.60
[112,] 109.7286 41.64
[113,] 118.2726 194.00
[114,] 105.2860 254.06
[115,] 117.7914 95.30
[116,] 110.4828 117.72
[117,] 114.6264 -0.10
[118,] 110.4828 3.00
[119,] 109.7286 25.82
[120,] 110.4828 2.68
[121,] 109.7286 115.54
[122,] 110.9396 0.20
[123,] 109.7286 -4.64
[124,] 109.7286 41.90
[125,] 110.4828 8.78
[126,] 110.4828 222.70
[127,] 112.9786 144.26
[128,] 110.1854 172.02
[129,] 110.9396 33.90
[130,] 109.7286 185.88
[131,] 114.2201 108.84
[132,] 110.4828 222.92
[133,] 116.0647 17.22
[134,] 109.7286 0.76
[135,] 109.7286 5.90
[136,] 110.4828 2.54
[137,] 109.7286 17.28
[138,] 110.4828 58.24
[139,] 109.7286 123.26
[140,] 110.9396 38.48
[141,] 109.7286 44.22
[142,] 110.4828 17.06
[143,] 110.4828 9.30
[144,] 110.4828 0.70
[145,] 108.1067 -0.30
[146,] 110.4828 11.16
[147,] 109.7286 7.86
[148,] 110.4828 5.62
[149,] 110.4828 1.26
[150,] 110.4828 3.12
[151,] 110.4828 0.02
[152,] 110.4828 51.52
[153,] 109.7286 0.66
[154,] 111.0211 80.12
[155,] 110.5121 156.14
[156,] 110.9396 103.18
[157,] 110.4828 3.50
[158,] 110.4828 86.94
[159,] 109.0526 -0.30
[160,] 110.4828 0.08
[161,] 110.4828 29.36
[162,] 113.8466 142.06
[163,] 111.5761 145.94
[164,] 110.4828 145.16
[165,] 110.4828 64.10
[166,] 110.4828 225.44
[167,] 114.2201 20.76
[168,] 110.4828 0.76
[169,] 110.4828 54.90
[170,] 110.4828 1.24
[171,] 109.7286 2.06
[172,] 110.4828 192.06
[173,] 109.6746 7.76
[174,] 110.4828 187.32
[175,] 116.2481 309.64
[176,] 114.2201 226.02
[177,] 114.2201 287.82
[178,] 113.4060 99.10
[179,] 110.4828 293.96
[180,] 114.2201 289.92
[181,] 113.5839 270.92
[182,] 112.9786 279.30
[183,] 114.2201 44.66
[184,] 110.4828 5.16
[185,] 110.4828 1.36
[186,] 110.4828 18.52
[187,] 110.4828 9.80
[188,] 110.4828 108.18
[189,] 110.4828 49.40
[190,] 110.4828 13.42
[191,] 109.6746 116.84
[192,] 110.4828 190.52
[193,] 114.2201 287.90
[194,] 111.8846 254.60
[195,] 112.9786 162.06
[196,] 116.6921 227.82
[197,] 112.9786 106.80
[198,] 110.4828 0.28
[199,] 110.4828 26.60
[200,] 110.4828 0.44
[201,] 110.4828 3.10
[202,] 110.4828 34.90
[203,] 110.4828 -0.40
[204,] 110.4828 20.94
[205,] 108.3542 228.48
[206,] 112.9786 193.86
[207,] 113.4060 208.34
[208,] 112.9786 202.52
[209,] 113.4060 249.34
[210,] 114.2201 245.08
[211,] 112.9786 294.82
[212,] 111.0706 238.92
[213,] 112.9786 176.32
[214,] 113.4060 277.50
[215,] 113.4060 303.38
[216,] 113.4060 232.18
[217,] 112.9786 207.32
[218,] 112.9786 252.96
[219,] 116.2641 57.38
[220,] 110.4828 7.78
[221,] 110.4828 10.96
[222,] 110.4828 22.32
[223,] 110.4828 68.06
[224,] 110.4828 1.50
[225,] 110.4828 0.08
[226,] 110.4828 5.20
[227,] 110.4066 7.14
[228,] 110.4828 -0.30
[229,] 110.4828 -0.40
[230,] 110.4828 11.40
[231,] 110.4828 14.52
[232,] 111.2378 183.14
[233,] 113.4060 46.46
[234,] 110.4828 148.50
[235,] 110.4828 138.80
[236,] 110.4828 224.66
[237,] 114.7810 128.68
[238,] 116.6921 263.22
[239,] 118.7898 228.00
[240,] 112.9786 277.24
[241,] 117.7914 238.78
[242,] 114.4194 305.18
[243,] 118.1769 147.00
[244,] 110.4828 75.58
[245,] 110.4828 11.04
[246,] 110.4828 30.08
[247,] 110.4828 13.06
[248,] 110.4828 10.80
[249,] 110.4828 6.18
[250,] 110.4828 2.60
[251,] 110.4983 0.36
[252,] 111.0211 40.58
[253,] 110.4828 17.22
[254,] 110.4828 -0.50
[255,] 110.4828 91.04
[256,] 109.7286 4.10
[257,] 109.6869 9.38
[258,] 110.4828 25.36
[259,] 110.4828 -2.04
[260,] 110.4828 46.14
[261,] 110.4828 71.48
[262,] 110.4828 105.70
[263,] 110.4828 88.76
[264,] 110.4828 86.30
[265,] 110.4828 8.66
[266,] 110.4828 60.50
[267,] 110.4828 95.74
[268,] 110.4828 -0.14
[269,] 110.4828 9.38
[270,] 111.3433 51.46
[271,] 110.4828 21.24
[272,] 110.4828 165.66
[273,] 110.4828 67.36
[274,] 110.4828 24.64
[275,] 109.7286 0.56
[276,] 110.4828 56.74
[277,] 110.4828 228.62
[278,] 105.5396 3.62
[279,] 110.4828 30.70
[280,] 110.4828 -0.20
[281,] 110.4828 1.10
[282,] 110.9080 0.64
[283,] 110.4828 6.34
[284,] 110.4828 3.32
[285,] 110.4828 68.88
[286,] 110.4828 16.50
[287,] 110.4828 11.36
[288,] 110.4828 1.54
[289,] 109.7286 6.08
[290,] 110.4828 0.58
[291,] 109.7286 15.80
[292,] 110.4828 40.20
[293,] 110.4828 163.94
[294,] 112.9786 172.64
[295,] 110.4828 86.60
[296,] 116.6921 68.88
[297,] 110.4828 -0.52
[298,] 110.4828 23.08
[299,] 110.4828 4.00
[300,] 109.7286 12.28
[301,] 110.4828 1.48
[302,] 109.7286 4.94
[303,] 110.4828 10.36
[304,] 117.0331 277.44
[305,] 113.4168 222.08
[306,] 113.7931 254.50
[307,] 112.9786 32.44
[308,] 118.2726 10.36
[309,] 109.7286 26.08
[310,] 110.2097 14.08
[311,] 110.4828 90.80
[312,] 109.7286 4.14
[313,] 110.4828 0.64
[314,] 110.4828 15.82
[315,] 110.4828 35.16
[316,] 110.4828 8.72
[317,] 110.4828 1.36
[318,] 110.4828 -0.06
[319,] 109.6746 52.34
[320,] 111.8010 -0.10
[321,] 109.7286 10.30
[322,] 111.0211 3.60
[323,] 109.7286 7.40
[324,] 109.7286 7.30
[325,] 110.4983 10.80
[326,] 111.3433 6.30
[327,] 109.7286 1.00
[328,] 111.3433 19.64
[329,] 112.9786 105.16
[330,] 110.4828 238.48
[331,] 113.1779 122.58
[332,] 110.4828 209.90
[333,] 114.2201 237.88
[334,] 112.9786 26.98
[335,] 110.4828 16.70
[336,] 110.4828 184.74
[337,] 111.3560 335.46
[338,] 114.7119 303.66
[339,] 119.5381 266.98
[340,] 118.7898 402.08
[341,] 111.3560 261.26
[342,] 117.9369 308.98
[343,] 119.5381 284.60
[344,] 118.7898 21.68
[345,] 110.4828 268.98
[346,] 105.2860 90.36
[347,] 110.0504 151.42
[348,] 110.4828 142.14
[349,] 110.4828 103.74
[350,] 110.4828 69.92
[351,] 110.4828 14.52
[352,] 110.4828 182.06
[353,] 111.5761 277.28
[354,] 118.7898 266.66
[355,] 120.9336 116.20
[356,] 109.0526 213.44
[357,] 105.2860 103.92
[358,] 109.8052 7.42
[359,] 110.4828 3.52
[360,] 110.4828 0.56
[361,] 110.4828 24.80
[362,] 110.4828 43.02
[363,] 110.4828 150.30
[364,] 110.4828 1.76
[365,] 110.4828 21.44
[366,] 110.4828 24.48
[367,] 110.4828 6.32
[368,] 110.4828 54.44
[369,] 110.4828 0.04
[370,] 110.4828 64.58
[371,] 112.9786 93.64
[372,] 110.4828 20.02
[373,] 110.4828 68.42
[374,] 110.4828 -0.10
[375,] 110.4828 30.32
[376,] 114.4194 256.32
[377,] 114.2201 294.50
[378,] 120.3206 278.32
[379,] 114.2201 202.20
[380,] 114.2201 151.96
[381,] 110.4828 234.18
[382,] 111.5761 352.36
[383,] 108.3754 280.86
[384,] 114.5092 167.24
[385,] 110.4828 85.04
[386,] 110.4828 -0.20
[387,] 110.4828 0.72
[388,] 110.4828 -1.46
[389,] 110.4828 6.46
[390,] 110.4828 11.68
[391,] 110.4828 1.70
[392,] 110.4828 30.32
[393,] 110.4828 0.76
[394,] 110.4828 6.72
[395,] 110.4828 12.16
[396,] 110.4828 19.40
[397,] 110.4828 0.00
[398,] 110.4828 14.52
[399,] 108.1067 102.00
[400,] 105.9670 122.04
[401,] 109.7286 196.74
[402,] 113.4060 132.10
[403,] 109.7286 0.12
[404,] 109.7286 94.16
[405,] 113.7814 10.16
[406,] 109.7286 34.96
[407,] 109.7286 8.56
[408,] 110.4828 -0.48
[409,] 108.1067 18.80
[410,] 110.4828 28.00
[411,] 111.2378 17.98
[412,] 108.1067 -3.24
[413,] 110.4828 1.80
[414,] 109.7286 -1.90
[415,] 110.4828 16.80
[416,] 110.4828 0.38
[417,] 109.7286 0.00
[418,] 110.4828 26.02
[419,] 110.4828 8.78
[420,] 109.7286 0.20
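The side-by-side listing above is hard to judge by eye, so it can help to condense it into a few summary accuracy indices. The following is a minimal sketch in base R, not necessarily the approach used elsewhere in this book: `accuracy_summary()` is a hypothetical helper introduced here for illustration, and the example vectors are simply the first five rows of the output printed above.

```r
# Hypothetical helper (not from any package): summary accuracy indices
# for predicted versus actual values
accuracy_summary <- function(predicted, actual) {
  keep <- complete.cases(predicted, actual)  # drop pairs with missing values
  predicted <- predicted[keep]
  actual <- actual[keep]
  error <- predicted - actual

  c(
    ME = mean(error),            # mean error (bias); positive = overprediction
    MAE = mean(abs(error)),      # mean absolute error
    RMSE = sqrt(mean(error^2)),  # root mean squared error
    # squared correlation; undefined if the predictions have no variance
    rsquared = if (sd(predicted) > 0) cor(predicted, actual)^2 else NA_real_
  )
}

# Illustration using the first five rows of the output printed above
predicted_example <- c(110.9396, 110.9396, 110.1854, 110.4828, 110.4828)
actual_example <- c(156.46, 130.18, 2.98, 24.84, 40.28)

accuracy_example <- accuracy_summary(predicted_example, actual_example)
round(accuracy_example, 2)
```

Assuming the objects created above are still in the workspace, the same helper could be applied to the full test set with `accuracy_summary(y_pred_test_qb, data_test_qb_matrix[,"fantasyPoints_lag"])`. Reporting the mean error alongside the absolute indices is a deliberate choice: it separates systematic over- or underprediction from overall imprecision.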
19.7.7.8 Generate Predictions for Next Season
Code
# Generate model predictions for next season
pred_nextYear_qb <- predict(
  gp_model_fit_qb,
  data = newData_qb_matrix[,pred_vars_qb],
  group_data_pred = newData_qb_matrix[,"gsis_id"],
  group_rand_coef_data_pred = cbind(
    newData_qb_matrix[,"ageCentered20"],
    newData_qb_matrix[,"ageCentered20Quadratic"]),
  predict_var = FALSE,
  pred_latent = FALSE)
newData_qb$fantasyPoints_lag <- pred_nextYear_qb$response_mean
# Merge with player names
newData_qb <- left_join(
  newData_qb,
  nfl_playerIDs %>% select(gsis_id, name),
  by = "gsis_id"
)
Error: object 'nfl_playerIDs' not found
Error in `select()`:
! Can't select columns that don't exist.
✖ Column `name` doesn't exist.
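The errors above suggest that the lookup table was either not loaded or did not contain the expected columns when the merge ran. One way to guard against this is to check the lookup table before joining. The following is a minimal sketch using toy data: `newData_example`, `playerIDs_example`, and all values in them are hypothetical stand-ins, and the actual column names in `nfl_playerIDs` may differ.

```r
library(dplyr)

# Toy stand-ins (hypothetical values) for the prediction data and ID lookup
newData_example <- data.frame(
  gsis_id = c("00-001", "00-002", "00-003"),
  fantasyPoints_lag = c(250.1, 180.4, 95.7)
)

playerIDs_example <- data.frame(
  gsis_id = c("00-001", "00-002"),
  name = c("Player A", "Player B")
)

# Confirm the lookup table has the columns the join relies on
required_cols <- c("gsis_id", "name")
missing_cols <- setdiff(required_cols, names(playerIDs_example))
if (length(missing_cols) > 0) {
  stop("Lookup table is missing: ", paste(missing_cols, collapse = ", "))
}

# Keep one row per player so the join cannot duplicate prediction rows
player_lookup <- playerIDs_example %>%
  select(all_of(required_cols)) %>%
  distinct(gsis_id, .keep_all = TRUE)

# Attach names and sort by predicted points; unmatched players get NA names
predictions_named <- newData_example %>%
  left_join(player_lookup, by = "gsis_id") %>%
  arrange(desc(fantasyPoints_lag))

predictions_named
```

Stopping with an explicit message when a column is missing makes the failure easier to diagnose than the generic `select()` error, and de-duplicating the lookup keeps the number of prediction rows unchanged after the join.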
19.8 Conclusion
19.9 Session Info
R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: UTC
tzcode source: system (glibc)
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[5] purrr_1.0.4 readr_2.1.5 tidyr_1.3.1 tibble_3.3.0
[9] tidyverse_2.0.0 gpboost_1.5.8 R6_2.6.1 caret_7.0-1
[13] lattice_0.22-7 ggplot2_3.5.2 powerjoin_0.1.0 missRanger_2.6.1
[17] doParallel_1.0.17 iterators_1.0.14 foreach_1.5.2 petersenlab_1.1.5
loaded via a namespace (and not attached):
[1] DBI_1.2.3 mnormt_2.1.1 pROC_1.18.5
[4] gridExtra_2.3 rlang_1.1.6 magrittr_2.0.3
[7] compiler_4.5.1 vctrs_0.6.5 reshape2_1.4.4
[10] quadprog_1.5-8 pkgconfig_2.0.3 fastmap_1.2.0
[13] backports_1.5.0 pbivnorm_0.6.0 rmarkdown_2.29
[16] prodlim_2025.04.28 tzdb_0.5.0 xfun_0.52
[19] jsonlite_2.0.0 recipes_1.3.1 psych_2.5.6
[22] lavaan_0.6-19 cluster_2.1.8.1 stringi_1.8.7
[25] RColorBrewer_1.1-3 ranger_0.17.0 parallelly_1.45.0
[28] rpart_4.1.24 Rcpp_1.0.14 knitr_1.50
[31] future.apply_1.20.0 base64enc_0.1-3 FNN_1.1.4.1
[34] Matrix_1.7-3 splines_4.5.1 nnet_7.3-20
[37] timechange_0.3.0 tidyselect_1.2.1 rstudioapi_0.17.1
[40] yaml_2.3.10 timeDate_4041.110 codetools_0.2-20
[43] listenv_0.9.1 plyr_1.8.9 withr_3.0.2
[46] evaluate_1.0.4 foreign_0.8-90 future_1.58.0
[49] survival_3.8-3 pillar_1.10.2 checkmate_2.3.2
[52] stats4_4.5.1 generics_0.1.4 mix_1.0-13
[55] hms_1.1.3 scales_1.4.0 globals_0.18.0
[58] xtable_1.8-4 class_7.3-23 glue_1.8.0
[61] Hmisc_5.2-3 tools_4.5.1 data.table_1.17.6
[64] ModelMetrics_1.2.2.2 gower_1.0.2 mvtnorm_1.3-3
[67] grid_4.5.1 mitools_2.4 ipred_0.9-15
[70] colorspace_2.1-1 nlme_3.1-168 RJSONIO_2.0.0
[73] htmlTable_2.4.3 Formula_1.2-5 cli_3.6.5
[76] viridisLite_0.4.2 lava_1.8.1 gtable_0.3.6
[79] digest_0.6.37 htmlwidgets_1.6.4 farver_2.1.2
[82] htmltools_0.5.8.1 lifecycle_1.0.4 hardhat_1.4.1
[85] MASS_7.3-65